(57) Abstract



A fully automated modular analytical system integrates instrumentation to permit analysis of biopolymer samples, such as nucleic
acids, proteins, peptides and carbohydrates. The system integrates analytical methods of detection and analysis, such as mass spectrometry,
radiolabeling, mass tags, chemical tags, fluorescence and chemiluminescence, with robotic technology and automated chemical reaction
systems to provide an accurate automated system for high-throughput analyses.
AUTOMATED PROCESS LINES
RELATED APPLICATIONS 

Benefit of priority to U.S. application Serial No. 09/285,481 to Hubert
Koster, Ping Yip, Jhobe Steadman, Dirk Reuter and Richard MacDonald, filed
April 2, 1999, entitled "AUTOMATED PROCESS LINE" is claimed. Where
permitted, the subject matter of this application is incorporated by reference in
its entirety.

FIELD OF THE INVENTION 

Provided herein are systems and methods using the systems for
performing high-throughput analyses of biopolymers.
BACKGROUND OF THE INVENTION 

In recent years, developments in the field of life sciences have proceeded
at a breathtaking rate. Groundbreaking scientific discoveries and advances in
such fields as genomics (sequencing and characterization of genetic information
and analysis of the relationship between gene activity and cell function) and
proteomics (systematic analysis of protein expression in tissues, cells, and
biological systems) promise to reshape the fields of medicine, agriculture, and
environmental science. The success of these efforts depends, in part, on the
development of sophisticated laboratory tools that will automate and expedite
the testing and analysis of biological samples.

Current methods of testing typically employ multiple instruments for
preparing and analyzing samples and involve multiple manual handling steps and
transfers. Such procedures are labor-intensive, time-consuming, and costly, and
they are susceptible to human error, sample contamination, and loss. After
samples have been prepared, they can be subjected to testing procedures that
produce data for analysis. Conventional testing procedures often must be
performed by an individual laboratory technician, one sample at a time.
Laboratory technicians are typically trained to operate only a single
instrument. Automation will reduce the number of
personnel and training necessary to carry out the research. Reliable and
accurate automated process and analysis tools are necessary for the benefits of
recent scientific discoveries to be fully achieved.

Genomic research is increasing the availability of genomic markers that
can be used for the identification of all organisms, including humans. These
markers (all genetic loci including SNPs, microsatellites and other noncoding
genomic regions) provide a way not only to identify populations but also to
stratify populations according to their response to drug treatment,
resistance to environmental agents, and other factors. Importantly, the
identification of the large number of genomic markers has become the driving
force behind the development of new automated technologies.

At the forefront of the efforts to develop better analytical tools are efforts
to expedite the analysis of complex biochemical structures. For example, robotic
devices have been employed to assist in sample preparation and handling.

Such automated sample preparation systems could find application in the
areas of: identification and validation of disease-causing genes or drug targets;
defining mutations and polymorphisms associated with specific diseases;
monitoring gene expression and comparing disease states, cell cycles or other
changes; genetic profiling of patients for responsiveness to genomics-based
therapies; and genetic profiling of subjects in drug clinical studies to link
response with genotype.

The utility of genomic markers to identify and stratify populations
depends on the industry's ability to measure great numbers (100-100,000) of
markers in large populations. This approach is severely constrained by time
and research costs. Automation of these systems provides advantages such as
increased throughput and accuracy, but miniaturization also is an important
consideration in terms of research costs. Accordingly, there is a need to
automate processes in which very small volumes are handled, while retaining the
accuracy of the results to permit their use in high-throughput screening protocols
and diagnostics.

Therefore, it is an object herein to provide automated systems and
methods for high-throughput analysis of biological samples, particularly samples
of very small volume, for screening, diagnosis and other procedures. Other
objects will become apparent from the following disclosure.
SUMMARY OF THE INVENTION

Provided herein is a fully automated modular analytical system that
integrates sample preparation, instrumentation, and analysis of biopolymer
samples. The samples include, but are not limited to, those containing
biopolymers, such as, but not limited to, nucleic acids, proteins, peptides,
carbohydrates, PNA (peptide nucleic acids), biopolymer (nucleic acid/peptide)
analogs, and libraries of combinatorial molecules. Samples of interest include
biological samples, such as, but not limited to, tissue and body fluid samples
from humans and other mammals.

The system integrates analytical methods of detection and analysis,
including, but not limited to, mass spectrometry, radiolabeling, mass tags,
chemical tags, fluorescence and chemiluminescence, with robotic technology
and automated chemical reaction systems to provide a high-throughput, accurate
system called an Automated Process Line (APL). The systems and methods
provided herein are particularly suited for handling very small volumes, on the
order of milliliters, microliters and submicroliters, including nanoliters and
even smaller picoliter volumes.

In certain embodiments, the analytical system includes one portion that is
a contamination-controlled environment, such as a clean room or laminar flow
room, and includes a means, such as a transporter, for moving the samples from
such environment into a second room or space for further processing. This dual
space system permits performance of procedures that require clean room
conditions to be automatedly linked to procedures that do not require such
conditions. The systems are particularly useful for analysis of nucleic acid and
protein samples, such as detection of polymorphisms, particularly single
nucleotide polymorphisms, and particularly using mass spectrometric analyses.

An integrated system for performing reactions, such as the system exemplified
herein and designated an Automated Process Line (APL), is provided. It
includes a process line that has a plurality of processing stations,
each of which performs a procedure on a biological sample contained in a
reaction vessel; a robotic system that transports the reaction vessel from
processing station to processing station; a control system that determines when
the procedure at each processing station is complete and, in response, moves
the reaction vessel to the next test station, and continuously processes reaction
vessels one after another until the control system receives a stop instruction;
and a data analysis system that receives test results of the process line and
automatically processes the test results to make a determination regarding the
biological sample in the reaction vessel.
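
The control behavior described above can be illustrated with a minimal
sketch. It is not the APL control software; the names Vessel,
run_process_line and make_station are hypothetical, and each station is
modeled as a callable that reports whether its procedure is complete.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Vessel:
    """A reaction vessel carrying one biological sample (hypothetical model)."""
    sample_id: str
    history: list = field(default_factory=list)


def run_process_line(stations, vessels, stop_requested):
    """Move each vessel through every station in order until told to stop.

    stations: list of (name, procedure) pairs; each procedure takes a Vessel
              and returns True once its step is complete.
    stop_requested: zero-argument callable polled before each vessel starts.
    """
    queue = deque(vessels)
    finished = []
    while queue and not stop_requested():
        vessel = queue.popleft()
        for name, procedure in stations:
            # Wait for the station to report completion before the
            # vessel is transported to the next station.
            while not procedure(vessel):
                pass
            vessel.history.append(name)
        finished.append(vessel)
    return finished


def make_station(name):
    # Stand-in procedure that always completes immediately (assumption).
    return name, (lambda vessel: True)


stations = [make_station(n) for n in ("prepare", "react", "analyze")]
done = run_process_line(stations, [Vessel("S1"), Vessel("S2")], lambda: False)
print([(v.sample_id, v.history) for v in done])
```

In this sketch the stop instruction is polled between vessels, so a vessel
that has entered the line always finishes its remaining stations.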

The automated system can run unattended continuously with a
continuous sample throughput and is capable of analyzing on the order of
10,000-50,000 genotypes per day. The results are highly accurate and
reproducible.

Also provided herein are methods for automated analysis of biopolymers
using the integrated automated high-throughput system. In preferred
embodiments, provided are automated methods for preparing a biological sample
for analysis; introducing the sample into an analytical instrument; recording
sample data; automatically processing and interpreting the data; and storing the
data in a bioinformatics database. In a particular embodiment, patient DNA
samples are automatically analyzed to determine genotype.
BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a diagram of the components of the automated process line.

FIG. 2 shows a magnetic strip construction of the magnetic lift illustrated
in FIG. 1.

FIG. 3 shows a point-magnet construction of the magnetic lift illustrated
in FIG. 1.

FIG. 4 shows the robotic interface between the chip processor and the
mass spectrometer of the automated process line illustrated in FIG. 1.

FIG. 5 shows a comparison of a mass spectrum of a test sample with
stored spectra from samples with known genotypes.

FIG. 6 is a flow diagram that illustrates the data analysis processing steps
performed by the automated process line of FIG. 1.

FIG. 7 shows an example of the user interface to the automated
high-throughput system (the APL automated system).

FIG. 8 shows an example of the interface to a database of experimental
mass spectral data.

DETAILED DESCRIPTION AND PREFERRED EMBODIMENTS 
Definitions 

Unless defined otherwise, all technical and scientific terms used herein
have the same meaning as is commonly understood by one of skill in the art to
which this invention belongs. All patents, patent applications and publications
referred to throughout the disclosure, including the background, herein are,
unless noted otherwise, incorporated by reference in their entirety. In the event
a definition in this section is not consistent with definitions elsewhere, the
definition set forth in this section will control.

As used herein, a molecule refers to any molecule or compound that is
analyzed by the methods contemplated herein. For purposes
herein, the molecules are often linked to a solid support, such as a bead.
Typically such molecules are biological particles or macromolecules or
components or precursors thereof, such as peptides, proteins, small organics,
oligonucleotides or monomeric units of the peptides, organics, nucleic acids and
other macromolecules. A monomeric unit refers to one of the constituents from
which the resulting compound is built. Thus, monomeric units include, but are
not limited to, nucleotides, amino acids, and pharmacophores from which small
organic molecules are synthesized.

As used herein, macromolecule refers to any molecule having a molecular
weight from the hundreds up to the millions. Macromolecules include peptides,
proteins, nucleotides, nucleic acids, and other such molecules that are generally
synthesized by biological organisms, but can be prepared synthetically or using
recombinant molecular biology methods.

As used herein, a biological particle refers to a virus, such as a viral
vector or viral capsid with or without packaged nucleic acid; a phage, including a
phage vector or phage capsid, with or without encapsulated nucleic acid; a
single cell, including eukaryotic and prokaryotic cells or fragments thereof; a
liposome or micellar agent or other packaging particle; and other such biological
materials. For purposes herein, biological particles include molecules that are not
typically considered macromolecules because they are not generally synthesized,
but are derived from cells and viruses.

As used herein, the term "nucleic acid" refers to single-stranded and/or
double-stranded polynucleotides such as deoxyribonucleic acid (DNA), and
ribonucleic acid (RNA) as well as analogs or derivatives of either RNA or DNA.
Also included in the term "nucleic acid" are analogs of nucleic acids such as
peptide nucleic acid (PNA), phosphorothioate DNA, and other such analogs and
derivatives.

As used herein, the term "biological sample" refers to any material
obtained from any living source, including, but not limited to, animal
(particularly mammalian, preferably human), plant, bacterial, fungal, protist
and viral samples. For purposes herein, the biological sample preferably contains a
nucleic acid molecule. Examples of appropriate biological samples include, but
are not limited to: solid materials (e.g., tissue, cell pellets, biopsies) and
biological fluids (e.g., urine, blood, saliva, amniotic fluid, mouth wash,
cerebrospinal fluid and other body fluids).

As used herein, the phrases "chain-elongating nucleotides" and
"chain-terminating nucleotides" are used in accordance with their art-recognized
meaning. For example, for DNA, chain-elongating nucleotides include
2'-deoxyribonucleotides (e.g., dATP, dCTP, dGTP and dTTP) and
chain-terminating nucleotides include 2',3'-dideoxyribonucleotides (e.g., ddATP,
ddCTP, ddGTP, ddTTP). For RNA, chain-elongating nucleotides include
ribonucleotides (e.g., ATP, CTP, GTP and UTP) and chain-terminating nucleotides
include 3'-deoxyribonucleotides (e.g., 3'dA, 3'dC, 3'dG and 3'dU). A complete
set of chain-elongating nucleotides refers to dATP, dCTP, dGTP and dTTP. The
term "nucleotide" is also well known in the art.

As used herein, nucleotides include nucleoside mono-, di-, and
triphosphates. Nucleotides also include modified nucleotides such as
phosphorothioate nucleotides and deazapurine nucleotides. A complete set of
chain-elongating nucleotides refers to four different nucleotides that can
hybridize to each of the four different bases comprising the DNA template.
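
The notion of a complete set can be illustrated with a minimal sketch, which
checks that four distinct chain-elongating nucleotides together pair with all
four DNA template bases. The names PAIRS_WITH, ELONGATORS and is_complete_set
are hypothetical, and only the unmodified dNTPs named above are modeled.

```python
# Watson-Crick partner for each of the four DNA template bases.
PAIRS_WITH = {"A": "T", "C": "G", "G": "C", "T": "A"}

# Base carried by each chain-elongating dNTP named in the text.
ELONGATORS = {"dATP": "A", "dCTP": "C", "dGTP": "G", "dTTP": "T"}


def is_complete_set(nucleotides):
    """True if four distinct elongators cover every template base."""
    bases = {ELONGATORS[n] for n in nucleotides if n in ELONGATORS}
    covered = {template for template, partner in PAIRS_WITH.items()
               if partner in bases}
    return len(set(nucleotides)) == 4 and covered == set(PAIRS_WITH)


print(is_complete_set(["dATP", "dCTP", "dGTP", "dTTP"]))
print(is_complete_set(["dATP", "dCTP", "dGTP", "ddTTP"]))
```

A chain-terminating ddTTP is deliberately not counted as an elongator here,
so the second call reports an incomplete set.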

As used herein, "multiplexing" refers to the simultaneous detection of
more than one analyte, such as more than one (mutated) locus on a particular
captured nucleic acid fragment (on one spot of an array).

As used herein, the term "biopolymer" is used to mean a biological
molecule composed of two or more monomeric subunits, or derivatives thereof,
which are linked by a bond or a macromolecule. A biopolymer can be, for
example, a polynucleotide, a polypeptide, a carbohydrate, or a lipid, or
derivatives or combinations thereof, for example, a nucleic acid molecule
containing a peptide nucleic acid portion or a glycoprotein, respectively. The
methods and systems herein, though described with reference to biopolymers,
can be adapted for use with other synthetic schemes and assays, such as
organic syntheses of pharmaceuticals, or inorganics and any other reaction or
assay performed on a solid support or in a well in nanoliter volumes.

As used herein, the term "polynucleotide" refers to an oligomer or
polymer containing at least two linked nucleotides or nucleotide derivatives,
including a deoxyribonucleic acid (DNA), a ribonucleic acid (RNA), and a DNA or
RNA derivative containing, for example, a nucleotide analog or a "backbone"
bond other than a phosphodiester bond, for example, a phosphotriester bond, a
phosphoramidate bond, a phosphorothioate bond, a thioester bond, or a peptide
bond (peptide nucleic acid). The term "oligonucleotide" also is used herein
essentially synonymously with "polynucleotide," although those in the art will
recognize that oligonucleotides, for example, PCR primers, generally are less
than about fifty to one hundred nucleotides in length.

Nucleotide analogs contained in a polynucleotide can be, for example,
mass modified nucleotides, which allow for mass differentiation of
polynucleotides; nucleotides containing a detectable label such as a fluorescent,
radioactive, luminescent or chemiluminescent label, which allow for detection of
a polynucleotide; or nucleotides containing a reactive group such as biotin or a
thiol group, which facilitates immobilization of a polynucleotide to a solid
support. A polynucleotide also can contain one or more backbone bonds that
are selectively cleavable, for example, chemically, enzymatically or
photolytically. For example, a polynucleotide can include one or more
deoxyribonucleotides, followed by one or more ribonucleotides, which can be
followed by one or more deoxyribonucleotides, such a sequence being cleavable
at the ribonucleotide sequence by base hydrolysis. A polynucleotide also can
contain one or more bonds that are relatively resistant to cleavage, for example,
a chimeric oligonucleotide primer, which can include nucleotides linked by
peptide nucleic acid bonds and at least one nucleotide at the 3' end, which is
linked by a phosphodiester bond, or the like, and is capable of being extended by
a polymerase. Peptide nucleic acid sequences can be prepared using well known
methods (see, for example, Weiler et al., Nucleic Acids Res. 25:2792-2799
(1997)).

A polynucleotide can be a portion of a larger nucleic acid molecule, for
example, a portion of a gene, which can contain a polymorphic region, or a
portion of an extragenic region of a chromosome, for example, a portion of a
region of nucleotide repeats such as a short tandem repeat (STR) locus, a
variable number of tandem repeats (VNTR) locus, a microsatellite locus or a
minisatellite locus. A polynucleotide also can be single stranded or double
stranded, including, for example, a DNA-RNA hybrid, or can be triple stranded or
four stranded. Where the polynucleotide is double stranded DNA, it can be in an
A, B, L or Z configuration, and a single polynucleotide can contain combinations
of such configurations.

As used herein, the term "polypeptide" means at least two amino acids,
or amino acid derivatives, including mass modified amino acids and amino acid
analogs, that are linked by a peptide bond, which can be a modified peptide
bond. A polypeptide can be translated from a polynucleotide, which can include
at least a portion of a coding sequence, or a portion of a nucleotide sequence
that is not naturally translated due, for example, to it being located in a reading
frame other than a coding frame, or it being an intron sequence, a 3' or 5'
untranslated sequence, a regulatory sequence such as a promoter, or the like. A
polypeptide also can be chemically synthesized and can be modified by chemical
or enzymatic methods following translation or chemical synthesis. The terms
"polypeptide," "peptide" and "protein" are used essentially synonymously
herein, although the skilled artisan will recognize that peptides generally contain
fewer than about fifty to one hundred amino acid residues, and that proteins
often are obtained from a natural source and can contain, for example,
post-translational modifications. A polypeptide can be post-translationally modified by
phosphorylation (phosphoproteins), glycosylation (glycoproteins, proteoglycans),
and the like, which can be performed in a cell or in a reaction in vitro.

As used herein, the term "conjugated" refers to stable attachment,
preferably ionic or covalent attachment. Among preferred conjugation means
are: streptavidin- or avidin-to-biotin interaction; hydrophobic interaction;
magnetic interaction (e.g., using functionalized magnetic beads, such as
DYNABEADS, which are streptavidin-coated magnetic beads sold by Dynal, Inc.,
Great Neck, NY and Oslo, Norway); polar interactions, such as "wetting"
associations between two polar surfaces or between oligo/polyethylene glycol;
formation of a covalent bond, such as an amide bond, disulfide bond, thioether
bond, or via crosslinking agents; and via an acid-labile or photocleavable linker.

As used herein, equivalent, when referring to two sequences of nucleic
acids, means that the two sequences in question encode the same sequence of
amino acids or equivalent proteins. When "equivalent" is used in referring to
two proteins or peptides, it means that the two proteins or peptides have
substantially the same amino acid sequence with only conservative amino acid
substitutions that do not substantially alter the activity or function of the protein
or peptide. When "equivalent" refers to a property, the property does not need
to be present to the same extent (e.g., two peptides can exhibit different rates
of the same type of enzymatic activity), but the activities are preferably
substantially the same. "Complementary," when referring to two nucleotide
sequences, means that the two sequences of nucleotides are capable of
hybridizing, preferably with less than 25%, more preferably with less than 15%,



WO 00/60361    PCT/US00/08111

even more preferably with less than 5%, most preferably with no mismatches 
between opposed nucleotides. Preferably the two molecules will hybridize under 
conditions of high stringency. 
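
The mismatch percentages in this definition can be made concrete with a
minimal sketch. It assumes equal-length DNA strands written 5' to 3', counts
only position-by-position mismatches between opposed nucleotides (no gaps or
bulges), and uses hypothetical names throughout.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def percent_mismatch(strand, opposed):
    """Percentage of opposed positions that are not Watson-Crick pairs.

    Both strands are given 5'->3'; the second is reversed so that
    position i of `strand` faces position i of the reversed `opposed`.
    """
    if len(strand) != len(opposed):
        raise ValueError("this sketch only compares equal-length strands")
    facing = opposed[::-1]
    mismatches = sum(1 for a, b in zip(strand, facing) if COMPLEMENT[a] != b)
    return 100.0 * mismatches / len(strand)


def preference_tier(pct):
    """Map a mismatch percentage to the tiers named in the definition."""
    if pct == 0:
        return "most preferred (no mismatches)"
    if pct < 5:
        return "even more preferred (<5%)"
    if pct < 15:
        return "more preferred (<15%)"
    if pct < 25:
        return "preferred (<25%)"
    return "outside the preferred ranges"


print(percent_mismatch("ACGT", "ACGT"))   # ACGT is its own reverse complement
print(preference_tier(percent_mismatch("AAAA", "TTTA")))
```

Whether two such sequences actually hybridize also depends on the
stringency conditions, which this sketch does not model.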

As used herein, stringency of hybridization in determining percentage
mismatch refers to those conditions understood by those of skill in the art,
which typically are substantially equivalent to the following:

1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C

2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C

3) low stringency: 1.0 x SSPE, 0.1% SDS, 50°C

It is understood that equivalent stringencies may be achieved using alternative
buffers, salts and temperatures.
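
For use in software, the three conditions listed above could be held in a
small lookup table, sketched below. The class and field names are
hypothetical; only the numeric values come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HybridizationCondition:
    sspe_x: float       # SSPE concentration, as a multiple of 1x
    sds_percent: float  # SDS concentration, percent
    temp_c: int         # temperature, degrees Celsius


STRINGENCY = {
    "high": HybridizationCondition(0.1, 0.1, 65),
    "medium": HybridizationCondition(0.2, 0.1, 50),
    "low": HybridizationCondition(1.0, 0.1, 50),
}


def describe(level):
    """Format one stringency level in the style of the list above."""
    c = STRINGENCY[level]
    return f"{level} stringency: {c.sspe_x} x SSPE, {c.sds_percent}% SDS, {c.temp_c}°C"


print(describe("high"))
```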

As used herein, a primer, when set forth in the claims, refers to a nucleic
acid suitable for mass spectrometric methods that require immobilizing,
hybridizing, strand displacement or sequencing. The primer
must be of low enough mass, typically about 70 nucleotides or fewer, and
of sufficient size to be useful in the mass spectrometric methods described
herein that rely on mass spectrometric detection. These methods include
primers for detection and sequencing of nucleic acids, which require a sufficient
number of nucleotides to form a stable duplex, typically about 6-30, preferably
about 10-25, more preferably about 12-20. Thus, for purposes herein a primer
will be a sequence of nucleotides comprising about 6-70, more preferably about
12-70, more preferably greater than about 14 to an upper limit of 70, depending
upon the sequence and application of the primer. The primers herein, for example for
mutational analyses, are selected to be upstream of loci useful for diagnosis
such that, when sequencing up to or through the site of
interest, the resulting fragment is of a mass that is sufficient and not too large to
be detected by mass spectrometry. For mass spectrometric methods, mass
tags or modifiers are preferably included at the 5'-end, and the primer is
otherwise unlabeled.
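
The length ranges in this definition can be expressed as a simple check, as
in the following minimal sketch. The function name is hypothetical, and the
"about" in each range is treated as a hard cutoff purely for illustration.

```python
def primer_length_tier(sequence):
    """Classify a primer by nucleotide count using the ranges given above."""
    n = len(sequence)
    if n < 6 or n > 70:
        return "outside the stated range"
    if n > 14:
        return "more preferred (greater than about 14, up to 70)"
    if n >= 12:
        return "preferred (about 12-70)"
    return "acceptable (about 6-70)"


print(primer_length_tier("ACGTACGTACGTACGTACGT"))  # 20 nucleotides
print(primer_length_tier("ACGTAC"))                # 6 nucleotides
```

The check considers length only; sequence composition and the mass of the
resulting extension fragment would also have to be considered in practice.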

As used herein, "conditioning" of a nucleic acid refers to modification of
the phosphodiester backbone of the nucleic acid molecule (e.g., cation
exchange) for the purpose of eliminating peak broadening due to a heterogeneity
in the cations bound per nucleotide unit. By contacting a nucleic acid molecule
with an alkylating agent such as alkyliodide, iodoacetamide, β-iodoethanol, or
2,3-epoxy-1-propanol, the monothio phosphodiester bonds of a nucleic acid
molecule can be transformed into a phosphotriester bond. Likewise,
phosphodiester bonds may be transformed to uncharged derivatives employing
trialkylsilyl chlorides. Further conditioning involves incorporating nucleotides that
reduce sensitivity for depurination (fragmentation during MS), e.g., a purine
analog such as N7- or N9-deazapurine nucleotides, or RNA building blocks, or
using oligonucleotide triesters, or incorporating phosphorothioate functions that
are alkylated, or employing oligonucleotide mimetics such as peptide nucleic acid
(PNA).

As used herein, the term "solid support" means a non-gaseous, non-liquid
material having a surface. Thus, a solid support can be a flat surface
constructed, for example, of glass, silicon, metal, plastic or a composite; or can
be in the form of a bead such as a silica gel, a controlled pore glass, a magnetic
or cellulose bead; or can be a pin, including an array of pins suitable for
combinatorial synthesis or analysis.

As used herein, substrate refers to an insoluble support onto which a
sample is deposited according to the materials described herein. A sample may be
linked directly or via a linker and retained by any suitable means, including
covalent, ionic and other bonds and other interactions. Examples of appropriate
substrates include, but are not limited to, solid supports such as silica gel,
controlled pore glass, magnetic beads, agarose gels and crosslinked dextrans
(i.e., Sepharose and Sephadex), cellulose and other materials known to those of skill
in the art to serve as solid support matrices. For example, substrates may be
formed from any or combinations of: silica gel, glass, magnetic materials,
polystyrene/1% divinylbenzene resins, such as Wang resins, which are
Fmoc-amino acid-4-(hydroxymethyl)phenoxymethylcopoly(styrene-1%
divinylbenzene (DVB)) resin, chlorotrityl (2-chlorotritylchloride
copolystyrene-DVB resin) resin, Merrifield (chloromethylated copolystyrene-DVB) resin,
metal, plastic, cellulose, cross-linked dextrans, such as those sold under the
trade name Sephadex (Pharmacia) and agarose gel, such as gels sold under the
trade name Sepharose (Pharmacia), which is a hydrogen bonded
polysaccharide-type agarose gel, and other such resins and solid phase supports
known to those of skill in the art. The support matrices may be in any shape or
form, including, but not limited to: capillaries, flat supports such as glass fiber
filters, glass surfaces, metal surfaces (steel, gold, silver, aluminum, copper and
silicon), plastic materials including multiwell plates or membranes (e.g., of
polyethylene, polypropylene, polyamide, polyvinylidenedifluoride), pins (e.g.,
arrays of pins suitable for combinatorial synthesis or analysis or beads in pits of
flat surfaces such as wafers (e.g., silicon wafers) with or without plates), and
beads. The supports include any supports used for retaining or conjugating
macromolecules, biopolymers, and biological particles.

As used herein, a selectively cleavable linker is a linker that is cleaved
under selected conditions, such as a photocleavable linker, a chemically
cleavable linker and an enzymatically cleavable linker (i.e., a restriction
endonuclease site or a ribonucleotide/RNase digestion). The linker is interposed
between the support and immobilized DNA.

As used herein, the term "liquid dispensing system" means a device that
can transfer a predetermined amount of liquid to a target site. The amount of
liquid dispensed and the rate at which the liquid dispensing system dispenses the
liquid to a target site, which can contain a reaction mixture, can be adjusted
manually or automatically, thereby allowing a predetermined volume of the liquid
to be maintained at the target site.

As used herein, the term "liquid" is used broadly to mean a non-solid,
non-gaseous material, which can be homogeneous or heterogeneous and can
contain one or more solid or gaseous materials dissolved or suspended therein.
In general, a liquid is a component of a reaction mixture that is susceptible to
evaporation under the conditions of the reaction. In particular, the liquid can be
a solvent, in which a reaction is performed, for example water or glycerol/water
or buffer or reaction mixture, where the reaction is performed in an aqueous
solution. The liquid can be any non-solid, non-gaseous solvent or other
component of a reaction mixture that is susceptible to evaporative loss, for
example, acetonitrile, which can be a solvent for a nucleic acid synthesis
reaction; formamide, which can be a liquid component of a nucleic acid
hybridization reaction; piperidine, which is a liquid component of a nucleic acid
sequencing reaction; or any other non-aqueous solvent or other liquid
component. A liquid can contain dissolved or suspended components, which
can be useful, for example, for initiating, terminating or changing the conditions
of a reaction, thereby facilitating the performance of single tube reactions.

As used herein, the term "reaction mixture" refers to any mixture,
including solutions and suspensions, in which a chemical, physical or biological
change is effected. In general, a change to a molecule is effected, although
changes to cells also are contemplated. A reaction mixture can contain a
solvent, which provides, in part, appropriate conditions for the change to be
effected, and a substrate, upon which the change is effected. A reaction
mixture also can contain various reagents, including buffers, salts, and metal
cofactors, and can contain reagents specific to a reaction, for example,
enzymes, nucleoside triphosphates, amino acids, and the like. For convenience,
reference is made herein generally to a "component" of a reaction, wherein the
component can be a cell or molecule present in a reaction mixture, including, for
example, a biopolymer or a product thereof.

As used herein, the term "target site" refers to a specific locus on a solid
support that can contain a liquid. A solid support contains one or more target
sites, which can be arranged randomly or in an ordered array or other pattern. In
particular, a target site restricts growth of a liquid to the "z" direction of an xyz
coordinate. Thus, a target site can be, for example, a well or pit, a pin or bead,
or a physical barrier that is positioned on a surface of the solid support, or
combinations thereof such as beads on a chip, chips in wells, or the like. A
target site can be physically placed onto the support, can be etched on a surface
of the support, can be a "tower" that remains following etching around a locus,
or can be defined by physico-chemical parameters such as relative hydrophilicity,
hydrophobicity, or any other surface chemistry that allows a liquid to grow
primarily in the z direction. A solid support can have a single target site, or can
contain a number of target sites, which can be the same or different, and where
the solid support contains more than one target site, the target sites can be
arranged in any pattern, including, for example, an array, in which the location of
each target site is defined.

As used herein, the term "predetermined volume" is used to mean any
desired volume of a liquid. For example, where it is desirable to perform a
reaction in a 5 microliter volume, 5 microliters is the predetermined volume.
Similarly, where it is desired to deposit 200 nanoliters at a target site,
200 nanoliters is the predetermined volume.

As used herein, a small volume typically refers to a volume on the order
of nanoliters, preferably less than 1 microliter and typically 0.5
microliters or less. The term nanoliter volume refers to a volume of about 0.1
to about 1000 nanoliters, preferably about 1 to 100 nanoliters.

As used herein, symbology refers to the code, such as a bar code, that is
engraved or imprinted on a surface. The symbology is any code known or
designed by the user. 

As used herein, a bar code refers to a symbology that is any array of,
preferably, optically readable marks of any desired size and shape that are
arranged in a reference context or frame of, preferably, although not necessarily,
one or more columns and one or more rows. For purposes herein, the bar code
refers to any symbology, not necessarily "bars", and may include dots, characters
or any symbol or symbols.

As used herein, the disclosed systems and methods generally are
useful where the reaction volume is about 500 milliliters or less; are more useful
where the reaction volume is about 5 milliliters or less; are most useful where
the reaction volume is in the "submilliliter" range, for example, about 500
microliters, or about 50 microliters or about 5 microliters or less; and are
particularly useful where the reaction volume is a "submicroliter" reaction
volume, which can be measured in nanoliters, for example, about 500 nanoliters
or less, or 50 nanoliters or less or 10 nanoliters or less, or can be measured in
picoliters, for example, about 500 picoliters or less or about 50 picoliters or less.
For convenience of discussion, the term "submicroliter" is used herein to refer to
a reaction volume less than about one microliter, although it will be readily
apparent to those in the art that the systems and methods disclosed herein are
applicable to subnanoliter reaction volumes as well.
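
The volume ranges named above can be sketched as a simple lookup. The following is a minimal illustration only: the function name, the use of nanoliters as the input unit, and the hard cutoffs are assumptions, since the text qualifies each boundary with "about".

```python
# Approximate tier boundaries taken from the text; the hard cutoffs are an
# illustrative assumption, since the text qualifies each value with "about".
NANOLITERS_PER_MICROLITER = 1_000.0
NANOLITERS_PER_MILLILITER = 1_000_000.0


def classify_reaction_volume(volume_nl: float) -> str:
    """Map a reaction volume in nanoliters to the range named in the text."""
    if volume_nl <= 0:
        raise ValueError("volume must be positive")
    if volume_nl < NANOLITERS_PER_MICROLITER:
        return "submicroliter"
    if volume_nl < NANOLITERS_PER_MILLILITER:
        return "submilliliter"
    if volume_nl <= 5 * NANOLITERS_PER_MILLILITER:
        return "about 5 milliliters or less"
    if volume_nl <= 500 * NANOLITERS_PER_MILLILITER:
        return "about 500 milliliters or less"
    return "above the stated ranges"
```

For instance, a 200 nanoliter deposit falls in the "submicroliter" range, while a 50 microliter reaction (50,000 nanoliters) falls in the "submilliliter" range.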

As used herein, a "peak" or graph encompasses the visual depiction
thereof and also the numerical values from which the visual depiction is
generated. A mass spectrum, for example, does not have to be depicted as
peaks, but can be depicted in other formats, including numerical values from
which the graph or visual depiction is generated.

As used herein, a room refers to a space, such as a room, chamber or a
hood or other enclosure that is in some manner separated. In an embodiment
herein, the system is designed to operate in two rooms, such that manipulations
that require sterile conditions can be performed in one room or chamber.
Manipulations that do not require such conditions can be performed in a second
room. Samples can then be automatically transported between the first room
and second room. As desired, additional rooms, with conditions designed for a
particular set of manipulations, may be included in the system.
Automated High Throughput Integrated System 

In the automated system constructed in accordance with the disclosure
herein, one or more robotic systems under computer control are used to
manipulate the sample of interest. The robot(s) are commanded by controlling
software and move the sample between the series of reaction and sample
preparation stations that comprise the system. The robot includes a robotic arm
that moves, for example, along a track or on a central pivot, and is typically
outfitted with a "gripper" arm, allowing it to grip reaction vessels and transport
them between stations. Such robotic systems are commercially available and
are commonly known to those of skill in the art. For example, a robotic system
and accompanying software can be obtained from Robocon Labor- und
Industrieroboter Ges.m.b.H of Austria ("Robocon"). In a preferred embodiment,
the system is the APL automated system and it includes a Robocon "Model CRS
A 255" robot, equipped with a "Digital Servo Gripper" mechanism, also available
from Robocon. The robotic systems are designed such that they can be
integrated with other computer-controlled instrumentation to perform
consecutive operations to effect a multi-step process.

In the preferred embodiment, one robot moves along a central track in a
contamination-controlled environment, such as a positive airflow or laminar flow
chamber, to perform a series of manipulations or reactions on a biological
sample. Once these steps are completed, the sample enters a second
contamination-controlled environment, which serves as an antechamber into a
non-sterile environment. The second environment can be sealed off from the
first contamination-controlled environment and/or the non-sterile environment.
For example, in a particular embodiment, the sample is transported from the
contamination-controlled laminar flow chamber into a transport chamber, or
so-called "taxicab." If desired, the taxicab can provide a sterile environment as
well.

Upon entry of the sample into the transport chamber, the
contamination-controlled environment is sealed off. The sample then moves along a
pneumatically-driven or motor-driven stage in the transport chamber, and the
transport chamber then opens into the non-sterile environment, such as an
open room. In the open room, a second robot, also moving along a central
track, takes control of manipulating the sample.

The sample to be analyzed is contained within a reaction vessel that is
designed to integrate with all of the components of the system and that is
amenable to the conditions of the chemical or biological reactions performed.
Preferred for high throughput analysis are reaction vessels that are capable of
containing multiple samples, such as multi-well microtiter plates, preferably
96-well or 384-well plates or chips, such as silicon microchips. The reaction
vessels also can comprise flat chips with reaction sites which are not wells, but
physical locations that contain the reaction using a chemical barrier. In certain
embodiments, the robot and/or gripper is adapted to hold a sample vessel. For
example, pins may be added to the gripper in alignment with the wells of a
microtiter plate for transporting the sample.

In high-throughput applications, where multiple sample plates are to be 
analyzed successively in an automated fashion, the samples can be held in a
sample storage system, or rack, where they are picked up by the system robot
and processed. An example of such a sample storage system, for use with 
multi-well microtiter plates, is the Robocon "Plate Cube" system. 

In steps where sample vessels are to be sealed, such as when subjected
to PCR amplification, or unsealed, such as for reagent addition or removal, an
automated lid application/removal and sealing system may be integrated into the
system. Examples of these include a lid parking station, such as is available
from Robocon, and a plate sealer, such as the "MJ Microseal", available from
MJ Research. A system turntable might also be employed to assist the system
robot in orienting the samples for delivery into each station of the system. Such
a turntable is available, for example, from Robocon. A shaker is
also included in the system in embodiments where beads, solid supports or other
reagents are added to the sample for immobilizing the sample, or where other
manipulations requiring mechanical shaking are involved.

In preferred embodiments, the sample plate or vessel is coded with a
symbology, such as a bar code, which can be read by a reader, to allow sample
tracking. In the preferred embodiment, separate bar code readers are contained
in the contamination-controlled and non-sterile environments. Bar code systems,
including one and two dimensional bar codes, readable and readable/writable
codes and systems therefor, are widely available, such as from Datalogic S.p.A.
of Italy ("Datalogic"), and are well known to those skilled in the art.

Sample handling and reagent additions are accomplished using automated
liquid handling systems. These include systems capable of automatically
dispensing liquids into the sample vessel, such as through a pipette, and can be
adapted to any sample format, such as a multiwell microtiter plate. Such
systems are commercially available, such as from Tecan AG of Switzerland
("Tecan") or Beckman Coulter, Inc. In a preferred embodiment, Tecan "Genesis
200/8" (200 cm width, including an 8-tip arm) liquid handling systems, as well as
a Beckman Coulter "Multimek 96" automated pipettor, are used for liquid
handling. Other liquid dispensing systems are described in allowed U.S. application
Serial No. 08/787,639, now U.S. Patent No. 6,024,925, U.S. application Serial
No. 08/786,988, and published International PCT application No. WO 98/20166,
which are incorporated herein by reference.

Also present in the system may be an apparatus for preparing a test
sample for analysis, including, for example, reagent addition means, or other
means for performing reactions or processes to prepare the sample for analysis.
In certain preferred embodiments, where mass spectral analysis, specifically
MALDI-TOF analysis, is to be performed using a sample array, a matrix material
(i.e., an organic acid) is added to the sample using an adapted piezoelectric
pipetting dispensing system. The dispensing system includes a hydrophobic tip,
which is capable of dispensing submolar, preferably nanomolar, samples. Such
systems, as well as methods for preparing and analyzing low volume analyte
array elements, have been described in allowed U.S. Patent Application No.
08/787,639, now U.S. Patent No. 6,024,925, U.S. application Serial No.
08/786,988, and published International PCT application No. WO 98/20166;
see also Little et al., Anal. Chem. 1997, 69, 4540-4546, the contents of which
are incorporated by reference herein in their entirety.

Alternatively, a system that dispenses liquid samples from the picoliter up
to the nanoliter range is commercially available, such as the "Nano-Plotter"
product from GeSiM GmbH of Germany ("GeSiM"). In other embodiments,
reactions such as radiolabeling or adding a mass tag to the sample may be
performed by the sample preparation apparatus.

A sample may also be transferred to or placed in a particular sample
analysis vessel for analysis. The particular type of sample analysis vessel used
is determined by the analytical method to be employed. For example, in a
preferred embodiment, where mass spectrometry (MALDI-TOF) is used for
analysis of a sample, a typical sample vessel is a silicon microchip (< 1 square
inch) that includes one or more, 100, 200, 300, 400, 500, up to 999 diagnostic
sites (typically in multiples of 96, such as 96, 384 and higher densities), or even
higher density, on a single chip, preferably in the pattern of a 2-D array. The
chip, or multiple chips, can then be placed on a sample platform, designed
specifically to be inserted into the mass spectrometer.

In a preferred embodiment, the analytical system is a MALDI-TOF mass
spectrometer. A preferred mass spectrometer is manufactured by
Bruker-Franzen Analytik GmbH of Germany ("Bruker") and uses a UV laser. In
the spectrometer, a brief pulse of laser irradiation is absorbed by the matrix,
leading to spontaneous volatilization and ionization of the matrix and DNA
fragments. The molecular weight of the gas-phase ions is then determined by
measurement of the time-of-flight of the ions, which increases with the square
root of their mass-to-charge ratio.
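
The relationship between flight time and mass can be sketched with the idealized linear time-of-flight model, in which an ion of charge z accelerated through a potential U acquires kinetic energy zeU and then drifts a length L at constant velocity. The model, the function name, and the default voltage and drift length below are illustrative assumptions and are not taken from the source or from any particular instrument.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # coulombs per elementary charge
DALTON_KG = 1.66053906660e-27          # kilograms per dalton


def flight_time_s(mass_da: float, charge: int = 1,
                  accel_voltage_v: float = 20_000.0,
                  drift_length_m: float = 1.0) -> float:
    """Idealized flight time: t = L * sqrt(m / (2 * z * e * U)).

    The default voltage and drift length are hypothetical values chosen
    only to make the example concrete.
    """
    mass_kg = mass_da * DALTON_KG
    kinetic_energy_j = charge * ELEMENTARY_CHARGE_C * accel_voltage_v
    velocity_m_per_s = math.sqrt(2.0 * kinetic_energy_j / mass_kg)
    return drift_length_m / velocity_m_per_s
```

Under this model, quadrupling the mass at fixed charge doubles the flight time, which is why heavier extension products arrive later and can be resolved from lighter ones.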

It should be understood that the nature of the sample to be analyzed and 
the analysis to be performed, as well as the feasibility of automating a reaction
process, determine the components integrated into the system, and the system
is not to be limited to the particular embodiments described herein. 

Module for performing the reaction in an unsealed environment
Systems for performing a reaction in an unsealed environment are
provided in copending U.S. application Serial No. 09/266,409, filed March 10,
1999, and International PCT application No. PCT/US00/06288, filed
March 10, 2000. These systems may be integrated into the high throughput
automated systems provided herein. Briefly, the systems and
methods provide a means for performing reactions in unsealed environments,
such as in unsealed containers and on unsealed surfaces, by monitoring and
maintaining the reaction volume. The liquid generally is present on a surface of a
solid support, at a target site, and the environment into which evaporation can
occur is air. The systems and methods provide a means to maintain a volume
of a liquid at a predetermined volume, where the volume otherwise would
decrease below the predetermined volume due to evaporation. These systems
include a support for performing the reaction; a nanoliter dispensing pipette for
dispensing an amount of a liquid onto the surface of the support; a temperature
controlling device for regulating the temperature of the support; and means for
controlling the amount of liquid dispensed, wherein the amount of liquid
dispensed corresponds to the amount of liquid that evaporates from the support,
wherein the system is not sealed. Hence, the systems for
performing reactions in an unsealed environment are contemplated for
inclusion as a module in the systems provided herein.
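
The volume-maintenance idea, dispensing an amount that corresponds to what has evaporated, can be sketched as a simple replenishment loop. This is a hedged illustration and not the control scheme of the referenced applications: the function names, the constant evaporation per step, and the minimum dispensable amount are all assumptions.

```python
def replenish_amount_nl(predetermined_nl: float, measured_nl: float,
                        min_dispense_nl: float = 1.0) -> float:
    """Return the volume (nL) to dispense to restore the predetermined volume.

    min_dispense_nl is a hypothetical smallest amount the pipette can deliver;
    deficits below it are left for a later step.
    """
    deficit_nl = predetermined_nl - measured_nl
    return deficit_nl if deficit_nl >= min_dispense_nl else 0.0


def simulate(predetermined_nl: float, evaporation_nl_per_step: float,
             steps: int, min_dispense_nl: float = 1.0):
    """Simulate evaporation followed by replenishment for a number of steps."""
    volume_nl = predetermined_nl
    dispensed_total_nl = 0.0
    history = []
    for _ in range(steps):
        # Evaporation removes liquid from the unsealed target site.
        volume_nl = max(0.0, volume_nl - evaporation_nl_per_step)
        # The dispenser adds back the amount that was lost.
        added_nl = replenish_amount_nl(predetermined_nl, volume_nl, min_dispense_nl)
        volume_nl += added_nl
        dispensed_total_nl += added_nl
        history.append(volume_nl)
    return history, dispensed_total_nl
```

With a 200 nanoliter predetermined volume and 5 nanoliters lost per step, the simulated volume returns to 200 nanoliters after every step and the total dispensed equals the total evaporated.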
Analytical methods 

The systems herein can be used to perform a number of different
reactions, dependent upon the nature of the sample and the analysis to be
performed. The system is typically used to perform analysis on biological
samples, typically biopolymers, including nucleic acids, proteins, peptides and
carbohydrates. Methods of analysis of the biological samples include all known
methods of analysis, including, but not limited to, mass spectrometry (all light
wavelengths), radiolabeling, mass tags, chemical tags, fluorescence, and
chemiluminescence, and particularly mass spectrometry.

In a preferred embodiment, the sample is a purified, previously amplified
portion of genomic DNA or a genomic DNA sample. For analysis of DNA samples,
reactions such as nucleic acid amplification (e.g., PCR, ligase chain reaction) and
enzymatic reactions, such as primer oligonucleotide base extension (PROBE),
nested PCR or sequencing, may be performed. In addition, the apparatus can be
used for hybridization (sequencing and diagnostic) reactions, and endo- and
exonuclease mapping of biopolymers.

In certain embodiments, the sample may be immobilized on a solid
support during all or part of the automated process. For example, enzymatic
reactions, including diagnostics, such as a method designated primer oligo base
extension (PROBE; see, e.g., published International PCT application No. WO
98/20019 and U.S. Patent No. 6,043,031), nested PCR, sequencing, and other
analytical and diagnostic procedures can be performed on solid supports (see,
e.g., U.S. Patent No. 5,605,798). Briefly, PROBE uses a single detection primer
followed by an oligonucleotide extension step to give products, which can be
readily resolved by MALDI-TOF mass spectrometry. The products differ in
length by a number of bases specific for a number of repeat units or for second
site mutations within the repeated region. The method is exemplified using as a
model system the AluVpA polymorphism in intron 5 of the interferon-α receptor
gene located on human chromosome 21, and the poly T tract of the splice
acceptor site of intron 8 from the CFTR gene located on human chromosome 7.
The method is advantageously used, for example, for determining identity,
identifying mutations, familial relationship, HLA compatibility and other such
markers using PROBE-MS analysis of microsatellite DNA. In a preferred
embodiment, the method includes the steps of a) obtaining a biological sample
from two individuals; b) amplifying a region of DNA from each individual that
contains two or more microsatellite DNA repeat sequences; c)
ionizing/volatilizing the amplified DNA; and d) detecting the presence of the amplified
DNA and comparing the molecular weight of the amplified DNA. Different sizes
are indicative of non-identity (i.e., wild-type versus mutation), non-heredity or
non-compatibility; similar size fragments indicate the possibility of identity,
familial relationship, or HLA compatibility. More than one marker may be
examined simultaneously; primers with different linker moieties are used for
immobilization.
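
The comparison in step d) can be sketched as a per-marker check of whether two measured molecular weights agree within a tolerance. This is a simplified illustration under stated assumptions: the marker names, the mass values in the usage note, the single mass per marker, and the tolerance are hypothetical and are not taken from the source.

```python
def compare_markers(masses_a: dict, masses_b: dict,
                    tolerance_da: float = 10.0) -> dict:
    """Compare amplicon masses (in daltons) for markers measured in both samples.

    Returns a mapping of marker name to True when the two masses agree within
    tolerance_da, which is a hypothetical instrument tolerance.
    """
    shared_markers = sorted(set(masses_a) & set(masses_b))
    return {
        marker: abs(masses_a[marker] - masses_b[marker]) <= tolerance_da
        for marker in shared_markers
    }


def possibly_identical(masses_a: dict, masses_b: dict,
                       tolerance_da: float = 10.0) -> bool:
    """Similar sizes at every shared marker indicate only the possibility of identity."""
    results = compare_markers(masses_a, masses_b, tolerance_da)
    return bool(results) and all(results.values())
```

For example, with hypothetical inputs `{"marker1": 12000.0, "marker2": 15500.0}` and `{"marker1": 12003.0, "marker2": 16740.0}`, the first marker agrees and the second does not, so the samples would be reported as non-identical.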

As noted, solid supports include, but are not limited to, flat surfaces,
microtiter plates, beads, wafers, chips, and silicon supports. Compositions and
methods for immobilizing nucleic acids to solid supports, including methods for
high density immobilization of nucleic acids, are described in U.S. Patent
Application Nos. 08/746,055 and 08/947,801 and published International PCT
application No. WO 98/20166. Linkers for immobilizing nucleic acids to solid
supports are well known. Linkers may be reversible or irreversible. A target
detection site can be directly linked to a solid support via a reversible or
irreversible bond between an appropriate functionality (L') on the target nucleic
acid molecule (T) and an appropriate functionality (L) on the capture molecule
(FIGURE 1B). A reversible linkage can be such that it is cleaved under the
conditions of mass spectrometry (i.e., a photocleavable bond such as a charge
transfer complex or a labile bond being formed between relatively stable organic
radicals).

Photocleavable linkers are linkers that are cleaved upon exposure to light
(see, e.g., Goldmacher et al. (1992) Bioconj. Chem. 3:104-107), thereby
releasing the targeted agent upon exposure to light. Photocleavable linkers that
are cleaved upon exposure to light are known (see, e.g., Hazum et al. (1981) in
Pept., Proc. Eur. Pept. Symp., 16th, Brunfeldt, K. (Ed.), pp. 105-110, which
describes the use of a nitrobenzyl group as a photocleavable protective group for
cysteine; Yen et al. (1989) Makromol. Chem. 190:69-82, which describes water
soluble photocleavable copolymers, including hydroxypropylmethacrylamide
copolymer, glycine copolymer, fluorescein copolymer and methylrhodamine
copolymer; Goldmacher et al. (1992) Bioconj. Chem. 3:104-107, which
describes a cross-linker and reagent that undergoes photolytic degradation upon
exposure to near UV light (350 nm); and Senter et al. (1985) Photochem.
Photobiol. 42:231-237, which describes nitrobenzyloxycarbonyl chloride cross
linking reagents that produce photocleavable linkages), thereby releasing the
targeted agent upon exposure to light. In preferred embodiments, the nucleic
acid is immobilized using the photocleavable linker moiety that is cleaved during
mass spectrometry. Exemplary photocleavable linkers are set forth in published
International PCT application No. WO 98/20019. Bead linkers for immobilizing
nucleic acids to solid supports are described in allowed U.S. application Serial
No. 08/746,036, now U.S. Patent No. 5,900,481, and published International
PCT application Nos. WO 98/20166 and WO 98/20020.

Preferred applications include, but are not limited to, sequencing and
diagnostics based on analysis of nucleic acids and polypeptides or diagnostics by
mass spectrometry. Preferred mass spectrometric methods include ionization (I)
techniques including, but not limited to, matrix assisted laser desorption
(MALDI), continuous or pulsed electrospray (ESI) and related methods (e.g.,
Ionspray or Thermospray), or massive cluster impact (MCI); the ion sources can
be matched with detection formats including linear or non-linear reflectron
time-of-flight (TOF), single or multiple quadrupole, single or multiple magnetic sector,
Fourier Transform ion cyclotron resonance (FTICR), ion trap, and combinations
thereof (e.g., ion-trap/time-of-flight). For ionization, numerous
matrix/wavelength combinations (MALDI) or solvent combinations (ESI) can be
employed. DNA sequencing by mass spectrometry is described in U.S. Patent
No. 5,547,835; U.S. Patent No. 5,691,141; and related U.S. application Serial
Nos. 08/467,208, 08/481,033 and 08/617,010 and in PCT Patent Application
Nos. Atty. Docket No. 24736-2007PC, filed December 15, 1998, and published
International PCT application Nos. WO 94/16101 and WO 97/37041.

DNA sequencing using mass spectrometry is described in U.S. Patent No.
5,547,835. DNA sequencing by mass spectrometry via exonuclease
degradation is described in allowed U.S. application Serial No. 08/744,590, U.S.
Patent No. 5,622,824, published International PCT application No.
PCT/US94/02938, U.S. Patent No. 5,851,765, and U.S. Patent No. 5,872,003.
Processes for direct sequencing during template amplification are described in
allowed U.S. Patent Application No. 08/647,368 and published International PCT
application No. WO 97/42348.

DNA diagnostics based on mass spectrometry are described in U.S.
Patent No. 5,605,798 and published International PCT application Nos. WO
96/29431 and WO 98/20019. Diagnostics based on mass spectrometric
detection of translated target polypeptides are described in U.S. Application No.
08/922,201 and published International PCT application No. WO 99/12040.
Mass spectrometric detection of polypeptides is described in U.S. Patent
Application No. 08/922,201 and U.S. application Serial No. 09/146,054.

It is understood that the nature of the sample to be analyzed and the 
analysis to be performed, as well as the feasibility of automating a reaction 
process, determine the methods used in the system, and the methods are not to 
be limited to the particular embodiments described herein. Any method and
process that requires small volumes and involves one or more steps in the
exemplified embodiment may be adapted and used in a system as described 
herein. 

Exemplary Embodiment 

One preferred embodiment, which is a dual space system, integrates
nucleic acid amplification (via PCR), immobilization of the nucleic acid on a solid
support, followed by enzymatic reaction (e.g., PROBE, mass array, sequencing,
nested PCR), sample conditioning, addition of an organic acid matrix for
MALDI-TOF analysis and MALDI-TOF analysis on a microchip. This embodiment is
described with respect to the Automated Process Line (APL) system 100
depicted in FIG. 1. As noted above, samples are initially prepared in a
contamination-controlled environment 102, such as a clean room or laminar flow
room, and are moved by a sterile transport chamber 104 or taxicab into a
non-sterile environment 106. In FIG. 1, samples are indicated by rectangular
elements with criss-crossed lines.

In the FIG. 1 embodiment, sample preparation begins in a Liquid Handling
System 108, such as the Tecan "Genesis 200/8 Robotic Sample Processor"
product. One or more samples 110 of purified genomic DNA are delivered by a
robot 112 to 96-well or 384-well microtiter plates 114 in the Liquid Handling
System 108, preferably using a 200 cm instrument width and an 8-tip arm.
These sample processing steps occur in the contamination-controlled
environment 102. Multiple samples may be included in the APL system for
high-throughput processing. These samples may, at times during processing, be held
in a sample storage apparatus, such as the "Plate Cube" rack 116 available from
Robocon. A PCR reaction mix 118 is added to the sample plates 114; the mix
includes PCR primers, one of which is labeled at the 5' end with a
functionality, such as biotin, that can be used to immobilize the amplicon to a
solid support. Where multiple samples are to be
processed, a wash solution is contained in a reservoir 120 and is used to clean
the pipette tips to prevent cross-contamination between samples or reagents.
Alternatively, the APL system can process multiple samples using disposable
pipette tips.

The sample plates are manipulated by a robotic system, for example the
Robocon robot 112, such as the CRS A 255 Robot, which moves along a central
track 122. The robot 112 operates under control of a clean room control system
computer 124 that includes a central processing unit (CPU) 126, an operator
interface 128, and an APL interface 130. The CPU can comprise any
commercially available desktop computer, such as an IBM-compatible personal
computer (PC) or the like. The operator interface 128 includes a visual display
and keyboard or other device through which an operator provides commands.
The APL interface 130 is an interface between the computer and the process
line, through which the computer 124 controls the robot. The APL interface
may include, for example, a robot control program installed in the computer 124
and available from Robocon for control of its robot products. An optional second
computer 131 can assist the first computer 124 in performing clean room
processing.

The robotic arm is equipped with a gripper 132, such as the "Digital
Servo Gripper" arm, also available from Robocon, to pick up and drop off the
sample plates 114 as needed for processing. In a particular embodiment, a
microtiter plate is aligned with the gripper so the plate receives pins 134 of the
gripper, which couple the plate to the gripper for more secure
transport.

FIG. 1 shows a sample plate 140, including the sample and PCR mix, that
is moved to a turntable 142 and oriented such that the robot picks it up and
moves it to a bar code reader 144, for example, as is available from Datalogic,
where the bar code is read and recorded for sample tracking. Sample tracking
and reorientation may be performed multiple times during sample processing to
assist the robot in sample handling.
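
The record kept for sample tracking can be sketched as a log of bar code reads per plate. This is a minimal illustration, not the APL software: the class and method names and the bar code string in the usage note are hypothetical; only the reader reference numerals 144 and 154 come from the description.

```python
from dataclasses import dataclass, field


@dataclass
class PlateTracker:
    """Hypothetical log of every bar code read, keyed by the plate's bar code."""
    scans: dict = field(default_factory=dict)

    def record_scan(self, bar_code: str, reader_id: int, station: str) -> None:
        # Append the read so the plate's path through the line is preserved in order.
        self.scans.setdefault(bar_code, []).append((reader_id, station))

    def history(self, bar_code: str) -> list:
        # Return a copy so callers cannot alter the stored log.
        return list(self.scans.get(bar_code, []))

    def last_seen(self, bar_code: str):
        reads = self.scans.get(bar_code)
        return reads[-1] if reads else None
```

A plate with the hypothetical code "PLATE-0001" read first by reader 144 and later by reader 154 would have a two-entry history, and its last known location would be the second reader.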

The sample plate 140 is reoriented by the robotic arm, using the
turntable, and is then placed in a lid parking station 146, such as is available as
a robotic module in the Robocon robotic system. At the lid parking station, a lid
may be parked or retrieved. In the preferred embodiment, the lid is a solid
structure, such as a metal lid, with a flexible seal such that placing the lid on the
plate seals the contents of the plate. The sealing eliminates evaporation during
subsequent processing, such as PCR amplification. Such a sealing apparatus,
known as "MJ Microseal", is available from MJ Research, Inc. Alternatively,
after the sample plate is reoriented, it can be penetrably sealed. For example,
the sample plate can be covered with a foil wrap that can later be penetrated by
test probes or the like. A similar penetrable seal can be provided by a parafilm
that is attached to the plate by heat, or other plastic or wax based sealers.

The sealed sample plate is then picked up by the robotic gripper arm and
transported from the laminar flow environment 102 into the taxicab transport
station 104, which provides a sterile environment. First, an entry door opens in
the taxicab to permit the robot to place the sample plate into the taxicab. Once
the sample is in the taxicab 104, the entry door closes behind it to prevent
contamination. Within the taxicab transport station 104, the sample plate is
placed onto and is transported along a pneumatically driven stage, and a second
door opens to permit the sample to exit the taxicab into a non-sterile
environment. Once outside the sterile taxicab environment, control of sample
manipulation is transferred to a second robot 150, also equipped with a gripper
152 and moving along a center track 153. The sample plate is transported by
the robot 150 and is read by a second bar code reader 154 for sample tracking.
The second bar code reader 154, as well as a second turntable 156, lid park
station 158 and sample storage rack 160 are included outside the
contamination-controlled area 102 for more efficient sample handling.
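
The door sequence of the transport chamber can be sketched as a small interlock in which the entry and exit doors are never open at the same time. The interlock rule is an inference from the statement that the entry door closes behind the sample to prevent contamination; the class, method names and step labels are hypothetical.

```python
class TransportChamber:
    """Hypothetical model of the taxicab with an entry door and an exit door."""

    def __init__(self):
        self.entry_open = False
        self.exit_open = False
        self.plate_inside = False

    def open_entry(self):
        if self.exit_open:
            raise RuntimeError("exit door must be closed before opening entry door")
        self.entry_open = True

    def close_entry(self):
        self.entry_open = False

    def open_exit(self):
        if self.entry_open:
            raise RuntimeError("entry door must be closed before opening exit door")
        self.exit_open = True

    def close_exit(self):
        self.exit_open = False

    def load_plate(self):
        if not self.entry_open or self.plate_inside:
            raise RuntimeError("cannot load plate")
        self.plate_inside = True

    def unload_plate(self):
        if not self.exit_open or not self.plate_inside:
            raise RuntimeError("cannot unload plate")
        self.plate_inside = False


def transfer(chamber: TransportChamber) -> list:
    """Run one clean-side to non-sterile-side transfer and return the step log."""
    steps = [
        ("open entry door", chamber.open_entry),
        ("load plate", chamber.load_plate),
        ("close entry door", chamber.close_entry),
        ("open exit door", chamber.open_exit),
        ("unload plate", chamber.unload_plate),
        ("close exit door", chamber.close_exit),
    ]
    log = []
    for name, action in steps:
        action()
        log.append(name)
    return log
```

Calling `transfer(TransportChamber())` walks through the six steps in order; attempting to open the exit door while the entry door is still open raises an error instead of exposing the clean side.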

The robot 150 operates under control of a PCR Room computer 161 that
has a construction similar to the Clean Room computer 124. Thus, the PCR
Room computer 161 can comprise any commercially available desktop computer 
that can interface with the APL system process line and stations. 

After the sample identification code has been read by the bar code reader
154, the sample plate is moved by the system robot 150 to a PCR station 162,
where amplification is carried out. The amplification reaction can be PCR, ligase
chain reaction, etc. In a preferred embodiment, the "MJR Tetrad" thermocycler,
available from MJ Research, Inc., is used for PCR amplification. Other PCR
thermocycler systems are commonly known to those of skill in the art and may
optionally be integrated into the system. Methods for DNA amplification are well
known to those of skill in the art. Multiplex PCR can also be carried out using
the system.

After PCR amplification, the plates are removed from the PCR reaction
station 162 by the robot 150. The plates are then moved to the lid park station
158, where the plates are unsealed and the lids are removed. As noted above, however, a
penetrable seal such as a foil wrap or parafilm is an alternative to a lid seal, and
if removable lids are not used to seal the plates, then the lid park station is
unnecessary and the next substance that must be added to the wells of the plate
will be inserted upon piercing of the foil wrap.

Alternatively, using a second liquid handling system 164, preferably a
Tecan "Genesis 200/8" system, streptavidin-coated paramagnetic beads can be
loaded from a reservoir 166 and mixed with the PCR-amplified DNA in the
sample plate, resulting in immobilization of the amplicon via the functionalized
(e.g., biotinylated) primer. Beads are used, for example, where the samples are
contained in multiwell microtiter plates. The beads and PCR products are
reacted by shaking, using a shaking apparatus 168, such as is available from
Robocon, and which is integrated into the APL system.

The sample plates are then moved to a liquid handling and mixing station
170, into which a magnetic lift station 172 has been incorporated, for post-PCR
processing. In a preferred embodiment, the liquid handling station is a
"Multimek 96" well pipetting station, available from Beckman. The magnetic lift
applies magnets to the sample plate by moving the magnets up against the
bottom of the sample plate, for example, by using a pneumatic lift, thereby
immobilizing the DNA and beads, and the supernatant is removed. The magnets
are then released and liquid is added to the wells to resuspend the sample.
Alternatively, the sample plate could be moved, for example, by the robot to
bring it into contact with the magnet. The magnet can be a solid surface that
interacts with the entire bottom of the sample plate, or can be designed to more
specifically interact with the individual samples. For example, where the sample
plate is a 96-well microtiter plate, the magnet can be configured as 8 or 12
individual strips so that each strip comes into contact with the bottom of a
single row of wells.

Conventionally, the magnets of the magnet lift station 172 are elongated, 
strip magnets arranged in rows between sample wells. Alternatively, the 
magnets can be configured as individual point magnets, for example, as disk-
shaped magnets arranged into an 8 x 12 grid of magnets that correspond to the
positions of the sample wells in a 96-well microtiter plate. This configuration
provides an advantage over the magnetic strip configuration, particularly where 
small volumes are to be added to the sample. For example, as illustrated in 
FIG. 2, where magnetic strips 202 are used with a multiwell microtiter plate 
204, the magnet strips are offset from the center of the sample wells 206, and
magnetic beads 208 concentrate along the sides of the wells.

It is desirable that all beads be concentrated in a location such that added 
liquid makes maximum contact with the samples. If, for example, a volume of 
sample is removed from the wells and a smaller volume is to be subsequently
added, the smaller volume might not be sufficient to wash all the beads from the 
side of the wells, and the sample concentration could be affected. FIG. 3 is a 
plan view of the alternative, preferred embodiment, and shows a portion of the
construction that centers a disk-shaped point magnet 302 beneath the center of
each sample well in a multiwell microtiter plate. For simplicity of illustration, 
only a 4X5 grid is shown. It should be apparent that by using individual point 
magnets at the bottom of the wells, the beads collect at the bottom of the wells 
and are more easily resuspended, particularly where a smaller volume of liquid is
to be added. Multiple rounds of liquid handling are employed to allow for
supernatant removal, denaturation of double stranded DNA, wash steps and the
addition of enzymatic reaction reagents (PROBE).

Returning to FIG. 1, a sample plate 176 is next moved by the robotic 
system to the lid park station 158, and sealed with a lid. This operation is
optional and is used, for example, when the sample is subjected to high
temperatures in order to prevent evaporation. The sample plate can otherwise
remain open to the environment.

The robot 150 moves the sample plate again to the PCR station 162 and 
places it into a thermocycler of the PCR station. The thermocycler carries out an
enzymatic reaction. The enzymatic reaction can be, for example, PROBE, nested
PCR, primer extension, or sequencing reactions (e.g. Sanger). Details for such 
enzymatic reactions are commonly known to those skilled in the art. 

After the reaction is complete, the sample plate is removed from the 
thermocycler of the PCR station 162 and then is returned to the lid park station
158 by the robot 150, and the lids are removed and the plate unsealed.

The sample plates are again moved to the liquid handling and mixing 
station 170 containing the magnetic lift station 172, which applies the magnets, 
immobilizing the beads and DNA. The liquid handling and mixing station then 
removes the supernatant. The magnets are then released and liquid is added to
the wells. Multiple rounds of liquid handling are employed to allow for washing
steps or treatment with ammonium citrate, TRIS, or any other reagent that
removes salt ions and replaces them with ammonium ions, thereby conditioning
the samples prior to mass spectrometry. Once conditioned, the primer extension
product is denatured from the immobilized DNA with ammonium hydroxide and
released into the supernatant. The ammonium hydroxide reaction is performed
for five minutes at approximately 60° F. The supernatant is removed to a clean
sample plate and placed on a shaker 168.

The sample plate is next transported to a sample preparation station 178 
to prepare it for analysis. In a preferred embodiment, where MALDI-TOF mass 
spectral analysis is performed, nanoliter or smaller volumes of sample are 
dispensed onto pre-made silicon chips to form a microarray and reacted with
matrix. In general, however, the sample may involve any preparation for use
with any analytical method. Nanoliter or smaller volumes are dispensed using a
nanoliter or lower dispensing apparatus, such as a piezoelectric pipette, including
the "Nano-Plotter" station, available from GeSiM. Finally, the sample plate is
transported to the analytical system, e.g., a mass spectrometer or other
spectrometric techniques, such as UV/VIS, IR, fluorescence, chemiluminescence
or NMR spectrometry, where sample analysis is performed.

Several alternatives are possible for preparing a sample for analysis and 
loading the sample into the analytical system. For example, three separate 
components, including a dispensing apparatus, a sample platform containing test
samples, and an analytical instrument, can be integrated into the APL system.

In a preferred embodiment, a nanoliter dispensing apparatus ("Nano-Plotter")
180 of the sample preparation station 178 is used to prepare one or
more samples for mass spectral (MS) analysis, preferably using MALDI-TOF MS.
In preparing a sample for MALDI-TOF analysis, the sample is co-crystallized with
a matrix material. The sample is then loaded into a mass spectrometer 182 on a
MS sample platform. Alternatively, the MS platform may be integrated into the 
mass spectrometer, rather than a separately-controlled component. The sample 
platform can be adapted to hold one or more sample analysis vessels, such as 
microchips.

In another embodiment, the APL system can carry out enzymology
directly on the beads and can directly add matrix to the beads to analyze using 
mass spectrometry, where the DNA is ionized directly off the beads. This 
eliminates the need for a nanoliter dispensing station 178 such as the GeSiM
"Nano-Plotter"; rather, matrix is added with the liquid handling system 170.

In a preferred embodiment, one or more microchips containing test 
samples are prepared by dispensing nanoliter volumes of a sample and an 
organic acid matrix onto a chip using a nanoliter dispensing apparatus 180, or a 
nanoliter dispensing apparatus, and loading the chips into a mass spectrometer
182. Alternative embodiments are possible where (1) one or more test samples,
e.g., on sample chips, are prepared on a sample platform on the nanoliter
dispensing apparatus, such as the Nano-Plotter, and the sample platform is then
transferred, e.g., by a robot, into the mass spectrometer; or (2) where one or
more sample chips are prepared on the nanoliter dispensing apparatus,
transferred to a mass spectrometer sample platform station 184 and then
inserted into the mass spectrometer. 

In another embodiment, the APL system can carry out enzymology 
directly on a microchip by performing the steps of: 

1. Aliquot genomic DNA and transfer to second chamber via taxi;
2. PCR amplify the genomic DNA using previously described steps;
3. Using a liquid handling apparatus (Tecan or GeSiM) or pintool add
   DNA to microchip. The chips are held in a holder that can be
   manipulated by the robot;
4. Add PCR reaction mix to chip;
5. Incubate on thermocycler;
6. Wash chip with liquid handling apparatus;
7. Add matrix to chip;
8. Load chip in MALDI; and
9. Ionization/Desorption directly from the chip via MALDI.

Mass Spectrometer Interface

The nanoliter dispensing apparatus and mass spectrometer are integrated 
into the system 100 and communicate with each other, either directly or via a 
control computer. For example, in one embodiment, commands are 
automatically executed from a computer controller to initiate opening and closing 
of a mass spectrometer entry door (e.g., by using pneumatics or a motor-driven
mechanism) and to initiate loading of a MS sample platform into the 
spectrometer (e.g., by using a robotic arm), where the platform is either loaded 
with sample chips directly on a nanoliter dispensing apparatus, such as the 
Nano-Plotter 180, or the sample chips are prepared on a nano-plotter 180 and 
then are transferred onto a sample platform 184. FIG. 4 shows one 
implementation of the robotic interface between the nanoliter dispensing 
apparatus and the mass spectrometer illustrated in FIG. 1. 

In the FIG. 4 embodiment, the samples are automatically transported from 
the sample preparation station 178 to the mass spectrometer 182 by a robotic 
arm system 410 (not shown in FIG. 1). As described above, the samples are 
prepared for the mass spectrometer 182 in the nanoliter dispensing apparatus 
180 and/or the sample platform station 184. When preparation is complete, an 
arm 412 rotates about a pivot base 414 to pick up the samples from the sample 
preparation station and then positions them at a sample entry station 416 of the 
mass spectrometer. 

Data Analysis 

Conventionally, the output of mass spectrometer testing is analyzed by 
an individual datum-by-datum, so that an individual examines the output of a 
sample test and makes a conclusion about the test, sample-by-sample. In the 
Automated Process Line (APL) described above, the volume of test results is 
sufficiently large that any individual analyzing the mass spectrometer output 
would quickly be unable to keep up with the APL output pace. The APL system 
of the preferred embodiment performs computer-automated analysis of mass 
spectrometer output data to determine genotype or make another analysis as 
quickly as the system produces test results. The data analysis can continue as 
long as the system is in operation, including on a round-the-clock, 24-hour basis.
The APL system performs the test output analysis by automatically processing
the mass spectrum output data of a sample, comparing the output data against
expected spectrum output values for different genotypes, concluding which
genotype is most likely for the sample, and continuing with the output data of
the next sample.

In the preferred embodiment illustrated in FIG. 1, the data analysis is 
performed by a dedicated data analysis computer 188 that receives output data 
from the mass spectrometer 182 and any other pertinent APL stations or 
components. The data analysis computer can comprise any commercially
available desktop computer, and can have the same configuration and
components as the Clean Room control computer 124 described above. Thus,
the data analysis computer 188 includes a CPU having an operating environment 
in which programs are executed, and also includes an operator interface with a 
keyboard and a display. 

The process line 100 operates continuously until a stop command is
received, for a high sample throughput. The process line therefore provides for
emergency situations where an immediate halt is required by providing halt
switches 198 placed around the line. The system also can be halted by a
software halt command that is input by an operator at any of the control
computers 124, 131, 161, 188. The sample preparation, testing, and data
analysis otherwise continues unimpeded.

A visual display of the data analysis is depicted in FIG. 5, which shows 
from top to bottom: a graph of two exemplary test spectra against which output 
data will be compared; a graph of output data picked peaks for analysis; and a
graph of smoothed spectrum data. Those skilled in the art will appreciate that
the spectra shown in FIG. 5 correspond to multiple graphs of mass spectrometer 
output, wherein the horizontal axis (x-axis) units are in mass per unit charge, 
also referred to as units of Daltons, and the vertical axis (y-axis) is in relative 
intensity of spectrometer discharge. 

The exemplary spectra shown in FIG. 5 relate to male-female genotypes,
but those skilled in the art will appreciate that any other paired-outcome typing
decisions may be the subject of the sample analysis.

In FIG. 5, the first test spectrum is labeled "Test-Female" and corresponds
to an output spectrum that might be expected from a female test subject. The
second test spectrum is labeled "Test-Male" and corresponds to an output
spectrum that might be expected from a male test subject. Thus, the object of
the APL processing will be to determine whether a given sample genotype belongs
to a female subject or a male subject. The "Picked Peaks" graph of FIG. 5 is a
display of the mass spectrometer output for a particular sample over a
predetermined range, to show particular output peaks. The output peaks shown
in the Picked Peaks graph are selected by the APL system based on input
parameters supplied by the APL operator, as described further below. The
bottom spectrum of FIG. 5 is a display of the spectrum output after correction
processing initiated by the APL system. It should be understood that the
Test-Female and Test-Male graphs of the FIG. 5 display will not change as the APL
system processes the mass spectrometer output data, while the Picked Peaks
and Smoothed Spectrum graphs are different for each sample, and therefore
will generally change with each sample being processed. It also should be
understood that the Picked Peaks and Smoothed Spectrum displays can be
stopped on any one of the output graphs, if the operator wants to view one
particular set of graphs. FIG. 6 is a flow diagram of the operating steps
performed by the APL system in carrying out the mass spectrometer data
analysis, and will be best understood with reference to the FIG. 5 graphs.

The first data analysis step, represented in FIG. 6 by the flow diagram 
box numbered 602, is to receive test run input parameters. These are 
parameters that the APL system will receive from an operator and will apply in
processing a run of mass spectrometer output data. That is, the APL system
will use the test run input parameters to evaluate test samples until the test run 
parameters are changed by the APL operator. As noted above, a test run might 
involve producing mass spectrometer output and analyzing it on a 24-hours-per- 
day basis. In the preferred embodiment, the operator provides the test run
parameters through a graphical user interface using a display, mouse, and
keyboard of the APL system. The test run input parameters received from the
operator will include the x-axis range in Daltons for the spectrometer output data
and x-axis locations of expected peaks that are picked for data identification and
genotype evaluation. The input parameters will also include an expected
baseline value, defining a noise floor above which data should comprise a peak.

In the next processing step, represented by the FIG. 6 flow diagram box
numbered 604, test data is received for a particular test sample submitted to the
mass spectrometer of the APL system. A particular test sample may be one well
in a 96-well-by-96-well tray, for example. Other tray sizes may be
accommodated by the APL.
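
By way of illustration only, the test run input parameters received at box 602 could be held in a single immutable record. The sketch below is not the disclosed implementation; the class name `RunParameters`, its field names, and the three example peak locations are hypothetical. The 4000 to 9000 Dalton range and the 10 Dalton peak width are the example values used elsewhere in this description.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class RunParameters:
    """Operator-supplied settings applied to every sample in a test run.

    All names here are hypothetical; the description only lists what the
    operator provides, not how the values are stored.
    """
    mass_min: float                     # lower x-axis bound, in Daltons
    mass_max: float                     # upper x-axis bound, in Daltons
    expected_peaks: Tuple[float, ...]   # x-axis locations of expected peaks
    baseline: float                     # expected noise floor
    peak_width: float                   # one width applied to all peaks

    def __post_init__(self):
        # Reject an empty or inverted range of interest.
        if self.mass_min >= self.mass_max:
            raise ValueError("mass_min must be below mass_max")
        # Every expected peak must lie inside the range of interest.
        for peak in self.expected_peaks:
            if not (self.mass_min <= peak <= self.mass_max):
                raise ValueError(f"expected peak {peak} is outside the range")


# Illustrative values: two genotype peaks plus one primer peak (assumed locations).
params = RunParameters(4000.0, 9000.0, (5000.0, 6500.0, 8000.0), 0.0, 10.0)
```

Because the record is frozen, the same parameters apply unchanged to every sample until the operator supplies a new set, which mirrors the behavior described for a test run.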

Those skilled in the art will understand that a mass spectrometer
bombards a crystalline-based sample with energy until the sample vaporizes and
output products are produced. The output products include intact sample
particles that are ionized and projected outwardly to different distances from the
sample center. The mass spectrometer detects the distribution of output
products having a particular mass per unit charge and assigns a relative intensity
to those output products. The mass/charge units are given in Daltons or
kiloDaltons (kD). Thus, the mass spectrometer output for a given sample is a
sequence of paired numbers, or x-y values, that specify the detected
mass/charge over a range of Daltons (x-axis) and the corresponding relative
intensity (y-axis) distribution over that range.

For each set of sample data that is processed, the APL system removes
the residual baseline. This processing is represented by the FIG. 6 flow diagram
box numbered 606, and allows for a rolling baseline that might otherwise skew
the output data. More particularly, with current processing systems, it is
possible to misinterpret peaks or spikes, such as where true data peaks are
located in valleys. Conventional programs identify peaks by detecting data
intensity values (see FIG. 5) that are greater than a baseline value. The data,
however, can contain localized areas in which a peak lies within a valley of a
plateau area having an elevated baseline. Peaks that are in such valleys may be
missed by conventional programs that do not detect a sufficient difference
between the peak height and the plateau level. It has been found that
such conventional programs may correctly identify peaks up to 80% of the time,
but cannot generally provide greater accuracy due to missed peaks.

To remove the residual baseline and increase accuracy, the APL data
analysis receives the input parameters that contain the operator's specification 
of where the peaks in the sample experimental results should be located in the 
mass spectrometer output. The APL system then examines the output data 
where there should be no peaks to find the true baseline value. The processing
represented by the FIG. 6 flow diagram box numbered 606 therefore includes 
modeling the baseline of the mass spectrometer output with a quadratic 
equation, based on the test run inputs from the operator. It has been found that 
a quadratic equation is superior to using a cubic equation, and also closer than a 
lower-order fit, even though very small coefficients are expected for the baseline
curve fit. 

For example, the range of interest might be mass spectrometer output 
over the range of 4000 to 9000 Daltons. The maximum range and minimum
range would be received as test run inputs. In addition, the expected peaks for
the sample experimental data over that range of interest would be received as
test run inputs. The data concerning expected peaks should include the peaks
that will be produced given the data types for which there is testing, and also
peaks expected in the output as a result of primer substances in the sample.
Thus, the range of interest should include output artifacts from primer sources.
These primer output artifacts can serve as landmarks to identify any output
shifting. In addition to the locations of the expected peaks, the APL system also
receives the peak width as an input test run parameter. The APL system
assumes that peaks will be distributed as a gaussian curve, and the peak width
input parameter indicates the approximate width for each of those peaks. In the
preferred embodiment, there is one input for all peaks. For example, all peaks
may be specified as having a width of 10 Daltons (ten x-axis units).

Next, with the test run parameters that specify the range of interest and 
the location of peaks, the APL system will identify peak-free regions in the mass 
spectrometer output of each sample that correspond to the range of interest,
with the data at the peaks removed. For example, suppose there are two peaks
of interest expected in the output that will identify a sample as being one
genotype or another. Suppose also that there is an additional peak expected in
the output, for primer output artifacts. Therefore, a total of three peaks will be
expected in the mass spectrometer output over the range of interest. Then the
peak-free regions would be those regions in the output data along the x-axis over
the range of interest, with the data at the three identified peaks deleted. As
noted above, the peaks are assumed to be gaussian, with a width value
specified in the input parameters. Therefore, the data for deletion comprises the
peaks identified in the test run input parameters and also an area two peak
widths wide on either side of each identified peak (peak midline, +/- two peak
widths).
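
The selection of peak-free regions described above can be sketched as a simple filter over the x-y pairs of a spectrum. This is a minimal illustration under stated assumptions, not the disclosed implementation: the function name `peak_free_points`, the 5 Dalton sampling step, the flat intensities, and the three peak locations are all hypothetical. Points lying exactly two peak widths from a midline are treated here as part of the deleted area.

```python
def peak_free_points(spectrum, expected_peaks, peak_width, mass_min, mass_max):
    """Return the (mass, intensity) pairs inside the range of interest that lie
    more than two peak widths away from every expected peak midline."""
    exclusion = 2.0 * peak_width
    kept = []
    for mass, intensity in spectrum:
        # Discard data outside the operator-specified range of interest.
        if not (mass_min <= mass <= mass_max):
            continue
        # Discard data within +/- two peak widths of any expected peak.
        if any(abs(mass - center) <= exclusion for center in expected_peaks):
            continue
        kept.append((mass, intensity))
    return kept


# Synthetic spectrum sampled every 5 Daltons with a flat intensity of 1.0.
spectrum = [(float(m), 1.0) for m in range(4000, 9001, 5)]

# Two genotype peaks and one primer peak (assumed locations), width 10 Daltons.
free = peak_free_points(spectrum, [5000.0, 6500.0, 8000.0], 10.0, 4000.0, 9000.0)
```

With a 5 Dalton step, each expected peak removes the nine samples from 20 Daltons below its midline to 20 Daltons above it, so 27 of the 1001 samples are deleted in this example.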

It is the mass spectrometer output data with the peaks deleted that gives
the peak-free region, to which the quadratic equation is fitted. Typically, the
variable quadratic coefficients would be small, but it is possible to get
contamination from the lower-mass sample particles, which can skew the
output. If such contamination is present in the output, then the sample output
may be skewed so that the peak free regions will be best modeled by a quadratic
equation. It has been found that contamination products are best modeled with
a quadratic equation, rather than a linear, cubic, or other type of equation.

The technique for determining the coefficients of the quadratic equation 
for the best fit to a peak-free baseline is preferably a least squares fit technique,
which will be well-known to those skilled in the art. In particular, error
minimization using gradient information has been found suitable for the least
squares fit. Thus, the curve-fit quadratic baseline equation can be used to
produce an expected baseline over the mass spectrometer output range of
interest. Therefore, as part of the baseline correction processing represented by
the FIG. 6 flow diagram box numbered 606, at each data point interval along the
range of interest (e.g., from 4000 to 9000 Daltons), the curve-fit baseline
equation is used to calculate a corrected baseline value, which is subtracted
from the sample data. The baseline correction occurs over the entire data range,
including at the peaks. This produces a new set of baseline-corrected sample
data values, i.e., a baseline-corrected output spectrum.
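
The baseline correction of box 606 can be illustrated with a short sketch. It is offered under stated assumptions and is not the disclosed implementation: where the description mentions error minimization using gradient information, the sketch substitutes the closed-form normal equations of a least squares quadratic fit, which reach the same minimum for this linear problem. The function names, the synthetic baseline, and the single synthetic peak at 6500 Daltons are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_quadratic_baseline(peak_free):
    """Least squares quadratic through peak-free (mass, intensity) pairs.

    Returns a function baseline(mass). Masses are rescaled to [-1, 1] before
    fitting so that the normal equations stay well conditioned.
    """
    xs = [x for x, _ in peak_free]
    mid = (max(xs) + min(xs)) / 2.0
    half = (max(xs) - min(xs)) / 2.0 or 1.0
    us = [(x - mid) / half for x in xs]
    ys = [y for _, y in peak_free]
    power_sums = [sum(u ** k for u in us) for k in range(5)]
    a = [[power_sums[i + j] for j in range(3)] for i in range(3)]
    b = [sum(y * u ** i for u, y in zip(us, ys)) for i in range(3)]
    c0, c1, c2 = solve_linear(a, b)

    def baseline(x):
        u = (x - mid) / half
        return c0 + c1 * u + c2 * u * u

    return baseline


def subtract_baseline(spectrum, baseline):
    """Subtract the fitted baseline at every data point, including the peaks."""
    return [(x, y - baseline(x)) for x, y in spectrum]


def true_baseline(x):
    # Synthetic, gently curved baseline used only for this example.
    return 2.0 + 1e-4 * (x - 4000.0) + 1e-8 * (x - 6500.0) ** 2


# Synthetic spectrum: curved baseline plus one narrow peak of height 50 at 6500.
spectrum = [
    (float(x), true_baseline(x) + 50.0 * math.exp(-0.5 * ((x - 6500.0) / 3.0) ** 2))
    for x in range(4000, 9001, 5)
]
peak_free = [(x, y) for x, y in spectrum if abs(x - 6500.0) > 20.0]
baseline = fit_quadratic_baseline(peak_free)
corrected = subtract_baseline(spectrum, baseline)
```

In this example the corrected spectrum is essentially zero away from the peak and retains the full peak height at 6500 Daltons, because the fit sees only the peak-free data while the subtraction is applied everywhere.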

In the next processing step, represented by the FIG. 6 flow diagram box 
numbered 608, a curve is fit to each baseline-corrected peak value in the mass
spectrometer output data. In the preferred embodiment, a standard curve fitting
algorithm is used, such as the Marquardt-Levenberg algorithm. This fits a
gaussian curve to each possible baseline-corrected output peak position. Those
skilled in the art will understand that the output of such curve fitting will provide
coefficients of a gaussian distribution centered at each peak that will match the
height of the baseline-corrected output data at that peak, and will also provide
the covariance of the curve-fit height. Thus, the box 608 curve fitting will
provide, for each peak, equation coefficients that give a peak height and a
covariance for the equation at that peak.
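
The curve fitting of box 608 can be illustrated in simplified form. The sketch below is not the Marquardt-Levenberg algorithm named above. It assumes that the peak center and width are fixed at the operator-supplied values, in which case the only free coefficient is the height and the least squares solution has a closed form. It further assumes that the peak width parameter is used as the gaussian standard deviation. The function name `fit_peak_height` and the synthetic data are hypothetical, and the returned variance stands in for the covariance of the curve-fit height.

```python
import math


def fit_peak_height(spectrum, center, width):
    """Fit height * exp(-0.5 * ((x - center) / width) ** 2) to the data within
    two widths of the center; return (height, variance_of_height)."""
    window = [(x, y) for x, y in spectrum if abs(x - center) <= 2.0 * width]
    if not window:
        raise ValueError("no data points near the expected peak")
    # Unit-height gaussian evaluated at each data point in the window.
    shape = [math.exp(-0.5 * ((x - center) / width) ** 2) for x, _ in window]
    norm = sum(g * g for g in shape)
    # Closed-form least squares estimate of the single linear coefficient.
    height = sum(y * g for (_, y), g in zip(window, shape)) / norm
    # Residual variance, propagated to the variance of the height estimate.
    residual = sum((y - height * g) ** 2 for (_, y), g in zip(window, shape))
    dof = max(len(window) - 1, 1)
    variance = (residual / dof) / norm
    return height, variance


# Noise-free synthetic peak of height 40 at 6500 Daltons, width 10 Daltons.
spectrum = [
    (float(x), 40.0 * math.exp(-0.5 * ((x - 6500.0) / 10.0) ** 2))
    for x in range(6460, 6541)
]
height, variance = fit_peak_height(spectrum, 6500.0, 10.0)
```

For noise-free data the fitted height reproduces the true height and the variance is effectively zero. With real output, the variance grows with the scatter of the data about the fitted curve.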

In a preferred embodiment, the "Picked Peaks" graph in FIG. 5 represents
all peaks in the mass spectrometer output that have a height that exceeds the
baseline corrected value generated by the box 606 processing, using peaks that
are modeled from the box 608 processing. Alternatively, the Picked Peaks graph
may represent the peaks in the actual mass spectrometer output that exceed an
input threshold value. This latter type of Picked Peaks graph display is the type
that is typically provided by mass spectrometer manufacturers, such as
Bruker-Franzen Analytik GmbH ("Bruker") of Germany. In the preferred
embodiment, the "Smoothed Spectrum" graph of FIG. 5 represents the output
from the mass spectrometer with default data processing, which may include
curve smoothing or other data processing provided by the mass spectrometer
manufacturer. This type of Smoothed Spectrum graph is provided, for example,
as standard output from the Bruker mass spectrometer. Alternatively, the
Smoothed Spectrum graph may represent the mass spectrometer output with the
baseline threshold parameter subtracted, or the actual mass spectrometer output
with the quadratic-fit baseline curve subtracted.

In the next processing step, represented by the FIG. 6 flow diagram box 
numbered 610, the APL system determines the probability that the output data 
at each identified peak location is a valid peak. In the preferred embodiment, the 
peak validation decision is made by comparing probability density functions
(PDF) for the peak-free region and for the fitted peak by constructing gaussian
(or normal) probability curves and comparing them to determine if the data 
overlaps. If the two curves (the fitted peak and the peak-free region) are
substantially free of any overlap, then the APL assumes that a true peak has
been identified. Otherwise, the fitted "peak" is considered a spurious datum in 
the noise of the mass spectrometer output. 

More particularly, the PDF of the peak-free region is assumed to be a
gaussian distribution. The mean height and the standard deviation are
determined by the mass spectrometer output for the sample in question. The
PDF at each identified peak location is assumed to be a gaussian distribution
with the mean height and the standard deviation given by the curve fitting
algorithm described in box 608. The second gaussian curve will be determined
once for each peak. The degree to which the two curves resemble each other is
compared statistically using hypothesis testing that will be well-known to those 
skilled in the art. The output of the hypothesis test will be a probability value 
(from zero to one) that characterizes the peak under consideration. Thus, each 
peak is assumed to be an independent statistical event. 

For example, the comparison uses the baseline curve, which is a
quadratic model (peak-free region) having a particular mean height and
corresponding standard deviation. The comparison also uses the gaussian model
of each peak, having a mean height and standard deviation. If the mean values
of the two respective curves are different by more than two standard deviations,
then it is assumed there is no overlap for purposes of peak validation. That is,
the test peak is a valid peak. If the two curves are not different in mean by 
more than two standard deviations, then the identified peak is not a valid peak, 
but is part of the output noise. 
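
The two-standard-deviation comparison can be sketched as follows. The description does not specify which standard deviation is used or which hypothesis test yields the probability, so both choices below are assumptions made for illustration: the two standard deviations are combined in quadrature, and the probability is taken from the normal cumulative distribution of the standardized difference in means. The function names are hypothetical. Only a peak that rises above the peak-free mean is treated as valid.

```python
import math


def peak_probability(peak_mean, peak_sd, noise_mean, noise_sd):
    """Probability (0 to 1) that the fitted peak stands above the peak-free
    region, using an assumed normal model of the difference in means."""
    spread = math.hypot(peak_sd, noise_sd)   # combined standard deviation
    if spread == 0.0:
        return 1.0 if peak_mean > noise_mean else 0.0
    z = (peak_mean - noise_mean) / spread
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def is_valid_peak(peak_mean, peak_sd, noise_mean, noise_sd, threshold=2.0):
    """True if the means differ by more than `threshold` combined standard
    deviations, i.e. the two curves are treated as free of overlap."""
    spread = math.hypot(peak_sd, noise_sd)
    return (peak_mean - noise_mean) > threshold * spread


# Combined standard deviation is 5.0 in both examples below.
strong = is_valid_peak(40.0, 3.0, 0.0, 4.0)   # difference of 40 exceeds 10
weak = is_valid_peak(8.0, 3.0, 0.0, 4.0)      # difference of 8 does not
```

The boolean test corresponds to the validation rule stated above, while the probability value is the kind of per-peak quantity that the subsequent typing decision consumes.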

After the APL system evaluates the probability for all of the peaks, it will
know the number of peaks that have been identified as valid. The system then
determines probabilities for the genotypes under consideration. The APL system 
makes a data typing decision based on the presence or absence of sufficient true 
or validated peaks to indicate one genotype or the other. This processing is 
indicated in FIG. 6 by the flow diagram box numbered 612, and is carried out in
a probabilistic manner.

For example, suppose a sample is to be typed as either female or male, 
and a female is indicated by the presence of an output peak at a position "A" 
and the absence of an output peak at a position "B", while a male is indicated by
a peak at position "A" and also at position "B". Then the probability of a sample
being female is the product of the probability of a true peak occurring at A and
the probability of a peak not occurring at B. Stated in equation form:

P(female) = P(A) * (1 - P(B)).

The probability of a sample being male is then the product of the probability of a
true peak occurring at position A and the probability of a true peak occurring at
position B, given by the equation:

P(male) = P(A) * P(B).

This analysis is performed automatically by the APL system for each of the
samples processed by the mass spectrometer. Based on these probabilities, the
APL system decides whether the mass spectrometer output identifies a male or a
female. If the probabilities indicate an ambiguous outcome, then the mass
spectrometer output is considered inconclusive. In the preferred embodiment, a
probability is considered conclusive if it is at least ten times the probability of the
alternative outcome. Thus, if P(female) is greater than ten times P(male), then
the typing decision is for a female. If P(male) > 10 * P(female), then the typing
decision is for a male.
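
The typing decision of box 612 follows directly from the two equations above and the ten-to-one rule. The following is a minimal sketch; the function name `type_sample` and the example probabilities are hypothetical.

```python
def type_sample(p_a, p_b, ratio=10.0):
    """Return (decision, P(female), P(male)) from the per-peak probabilities.

    p_a: probability of a true peak at position A
    p_b: probability of a true peak at position B
    """
    p_female = p_a * (1.0 - p_b)   # peak at A, no peak at B
    p_male = p_a * p_b             # peaks at both A and B
    if p_female > ratio * p_male:
        return "female", p_female, p_male
    if p_male > ratio * p_female:
        return "male", p_female, p_male
    # Neither outcome is ten times as probable as the other.
    return "inconclusive", p_female, p_male


female_call = type_sample(0.99, 0.01)[0]     # strong A peak, no B peak
male_call = type_sample(0.99, 0.98)[0]       # strong peaks at A and B
unclear_call = type_sample(0.90, 0.50)[0]    # B peak is a coin toss
```

Note that when the peak at A is itself doubtful, both probabilities are small, and a sample with no evidence of a peak at A is reported as inconclusive rather than assigned to either type.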

After the analysis has been performed for a sample subject, the APL
system checks for additional mass spectrometer output for analysis. As noted
above, the APL system can support mass spectrometer output at the rate of
hundreds of output sets per hour. As indicated by the decision box 614 in FIG.
6, if more data is present, an affirmative outcome at box 614, then APL control
resumes with receiving the next set of output data at the flow diagram box
numbered 604. If there is no more mass spectrometer output data for analysis
or if a system operator indicates a halt command, a negative outcome at box
614, then the sample run ends and other operation of the APL continues. For
example, operation may return to box 602, where more test run input
parameters are received and output analysis is resumed. Other processing may
occur, as desired.

Databases

In cases of high-throughput, the system stores results of all samples in all
runs in a database. The sample run history may be selected for viewing through
a user interface, such as, but not limited to, that illustrated in FIG. 7.
The user interface permits review of the database created by one or more
sample runs. An example of the user interface to such a database is shown in
the screen display of FIG. 8. The database provides a means of obtaining test
output, reaction details, sample details, and assay details for each sample under
test. For example, shown as output collected in the database are the sample
plate number, location of the sample well, sample and plate IDs, name, result of
genotype matching, and actual spectrum for each sample.

A database analysis system is also integrated into the APL system (see 
FIG. 7) and permits a user to (1) create a new run; (2) copy an existing run; (3) 
edit or view an existing run; (4) change status or add comment; (5) view the
history of a run; and (6) create or edit an assay or test. In the preferred
embodiment, the database is supported by a database management system from
Oracle Corporation.
The systems may also draw from data available in databases or collected 
in databases. Of interest and exemplary, though not limiting, thereof, is the
healthy patient database described in copending U.S. provisional application
Serial No. 60/159,176, filed October 13, 1999. 

The processes, systems, and products provided herein have been 
described above in terms of a presently preferred embodiment. There are, 
however, many configurations for automated high throughput systems that
include processing stations not specifically described herein but that are
apparent from the disclosure herein. The disclosure herein is not limited to the
particular embodiments described herein, but rather, is understood to have wide 
applicability with respect to automated process lines generally, particularly in the 
areas of diagnostics and high throughput screening protocols. 

Since modifications will be apparent to those of skill in this art, it is intended
that this invention be limited only by the scope of the appended claims.

CLAIMS:

1. An automated system for high throughput processing of biological
samples, the system comprising:

a plurality of processing stations, each of which performs a procedure on
a biological sample contained in a reaction vessel, wherein one of the processing
stations comprises a mass spectrometer;

a robotic system that transports the reaction vessel from processing
station to processing station;

a data analysis system that receives test results of the processing
stations and automatically processes the test results to make a
determination regarding the biological sample in the reaction
vessel; and

a control system that determines when the test at each processing
station is complete and, in response, moves the reaction vessel to
the next test station, and continuously processes reaction vessels
one after another until the control system receives a stop
instruction.

2. A system of claim 1, wherein the reaction vessel comprises a
multiple-well substrate or a solid support or both.

3. A system of claim 1 or claim 2, further including a mass
spectrometer interface that automatically transfers samples into the mass
spectrometer for processing.

4. A system of any of claims 1-3, wherein the data analysis system
processes the test results from the mass spectrometer and makes a data typing
decision regarding the biological sample.

5. A system of any of claims 1-4, wherein the data analysis system
processes the test results by receiving test data from the mass spectrometer
such that the test data for a biological sample contains one or more peaks,
whereupon the data analysis system removes a residual baseline from the test
data for a biological sample, curve fits each peak of the biological sample test
data to predetermined input parameters, determines a probability that each peak
of the biological sample test data is a valid peak, and makes a data typing
decision regarding the biological sample in accordance with the determined valid
peaks.

6. A system of any of claims 1-5, wherein the data analysis system 
displays exemplary test spectra for data types determined by the data analysis 
system, along with a graph of test data picked peaks and a graph of smoothed 
test spectra data for a biological sample. 

7. A system of any of claims 1-6, wherein the data analysis system 
receives test run input parameters that determine processing until a different set 
of input parameters is received. 

8. A system of any of claims 1-7, wherein the data analysis system 
removes the residual baseline from the test data by modeling the baseline of the 
mass spectrometer data with a quadratic equation specified by the input 
parameters. 

9. A system of claim 8, wherein the input parameters specify a range 
of data over which the baseline is modeled. 

10. A system of claim 9, wherein the baseline is modeled over a peak
free region specified by the input parameters.

11. A system of claim 7 or claim 8, wherein the picked peaks graph 
represents all peaks in the mass spectrometer output that have a height that 
exceeds the residual baseline corrected data. 

12. A system of claim 11, wherein the data analysis system validates 
a peak after comparing a probability density function for the peak free region 
with a probability density function for a fitted peak if the comparison shows that 
the respective probability density functions overlap by a predetermined amount. 

13. A system of any of claims 1-12, wherein the process line includes 
a contamination-controlled environment and a non-sterile environment, each 
environment containing one or more processing stations. 

14. An automated system for high throughput processing of biological
samples, the system comprising:

a process line comprising a plurality of processing stations, each of which
performs a procedure on a biological sample contained in or on a
reaction vessel;

a robotic system that transports the reaction vessel from processing
station to processing station;

a data analysis system that receives test results of the process line and
automatically processes the test results to make a determination
regarding the biological sample in the reaction vessel;

a control system that determines when the test at each processing
station is complete and, in response, moves the reaction vessel to
the next test station, and continuously processes reaction vessels
one after another until the control system receives a stop
instruction; and

a contamination-controlled environment and a non-sterile
environment, each environment containing one or more processing
stations.

15. The system of claim 13 or claim 14, further comprising a
taxicab that automatically transports samples between the two environments.

16. A method for high throughput processing of biological samples,
the method comprising:

transporting a reaction vessel along a process line of the system of any of
claims 1-15 having a plurality of processing stations, each of
which performs a procedure on one or more biological samples
contained in the reaction vessel;

determining when the test procedure at each processing station is
complete and, in response, moving the reaction vessel to the next
processing station;

receiving test results of the process line and automatically processing the
test results to make a data analysis determination regarding the
biological samples in the reaction vessel; and

processing reaction vessels continuously one after another until receiving
a stop instruction.

17. The method of claim 16, wherein the step of transporting includes
automatically transferring samples into a mass spectrometer for processing using
a robotic mass spectrometer interface.

18. A method of claim 16, wherein the step of receiving test results
comprises:

receiving test data from the mass spectrometer such that the test data for 
a biological sample contains one or more peaks; 

removing a residual baseline from the test data for a biological sample; 

curve fitting each peak of the biological sample test data to 
predetermined input parameters; 

determining a probability that each peak of the biological sample test data 
is a valid peak; and 

making a data typing decision regarding the biological sample in 
accordance with the determined valid peaks. 

19. A method of claim 16, further including the step of displaying 
exemplary test spectra for data types to be determined by the data analysis 
system, along with a graph of test data picked peaks and a graph of smoothed 
test spectra data for a biological sample. 

20. A method of claim 16, wherein the data analysis system receives
test run input parameters that determine processing until a different set of input
parameters is received.

21. A method of claim 20, wherein the step of displaying comprises
displaying exemplary test spectra for data types to be determined by the data
analysis system, along with a graph of test data picked peaks and a graph of
smoothed test spectra data for a biological sample, and the input parameters
specify display parameters.

22. A method of claim 16, wherein the step of removing residual
baseline from the test data comprises modeling the baseline of the mass 
spectrometer data with a quadratic equation specified by the input parameters. 

23. A method of claim 22, wherein the input parameters specify a 
range of data over which the baseline will be modeled. 

24. A method of claim 23, wherein the baseline is modeled over a 
peak free region specified by the input parameters. 

25. A method of claim 21, wherein the picked peaks graph represents
all peaks in the mass spectrometer output that have a height that exceeds the 
residual baseline corrected data. 

26. A method of claim 25, wherein the data analysis system validates
a peak after comparing a probability density function for the peak free region
with a probability density function for a fitted peak if the comparison shows that
the respective probability density functions overlap by a predetermined amount.

27. A method of claim 16, wherein the process line includes a
contamination-controlled environment and a non-sterile environment, and the
step of transporting includes automatically transporting samples between the
two environments in a sterile taxicab.

28. A data analysis system comprising:

a computer having an operating environment that executes a data
analysis program for processing test results from a process line of
any of claims 1-15 having a plurality of processing stations, each of
which performs a procedure on a biological sample contained in or
on a reaction vessel; and

a computer interface that receives the test results from the process line
and provides the test results to the data analysis program;

wherein the data analysis program automatically processes the test results to
make a determination regarding the biological sample in the reaction vessel, and
continuously performs such processing for biological samples until a stop
instruction is received.

29. A data analysis system of claim 28, wherein the data analysis
system processes the test results by receiving test data from the mass
spectrometer such that the test data for a biological sample contains one or
more peaks, whereupon the data analysis system removes a residual baseline
from the test data for a biological sample, curve fits each peak of the biological
sample test data to predetermined input parameters, determines a probability
that each peak of the biological sample test data is a valid peak, and makes a
data typing decision regarding the biological sample in accordance with the
determined valid peaks.

30. A data analysis system of claim 28 or claim 29, wherein the data
analysis system displays exemplary test spectra for data types to be determined
by the data analysis system, along with a graph of test data picked peaks and a
graph of smoothed test spectra data for a biological sample.

31. A data analysis system of claim 28 or claim 29, wherein
the data analysis system receives test run input parameters that determine
processing until a different set of input parameters is received.

32. A data analysis system of claim 31, wherein the data analysis
system displays exemplary test spectra for data types to be determined by the
data analysis system, along with a graph of test data picked peaks and a graph
of smoothed test spectra data for a biological sample, and the input parameters
specify display parameters.

33. A data analysis system of claim 28, wherein the data analysis
system removes the residual baseline from the test data by modeling the
baseline of the mass spectrometer data with a quadratic equation specified by
the input parameters.

34. A data analysis system of claim 33, wherein the input parameters 
specify a range of data over which the baseline will be modeled. 

35. A data analysis system of claim 34, wherein the baseline is
modeled over a peak free region specified by the input parameters.

36. A data analysis system of claim 35, wherein the picked peaks 
graph represents all peaks in the mass spectrometer output that have a height 
that exceeds the residual baseline corrected data. 

37. A data analysis system of claim 36, wherein the data analysis
system validates a peak after comparing a probability density function for the
peak free region with a probability density function for a fitted peak if the
comparison shows that the respective probability density functions overlap by a
predetermined amount.

38. A method for high throughput processing of biological samples,
the method comprising:

transporting a reaction vessel along a process line having a processing
station that performs a mass spectrometer test procedure on one
or more biological samples contained in the reaction vessel;

providing the reaction vessel to the mass spectrometer and performing
the mass spectrometer test; and

continuously providing reaction vessels to the mass spectrometer and
receiving test results of the mass spectrometer and automatically
processing the test results to make a determination regarding a
characteristic of the biological samples in the reaction vessel,
wherein the characteristic is the biological sample genotype.

39. A method of claim 38, wherein the reaction vessel comprises a
multiple-well sample tray or solid support.

40. A method of claim 38 or 39, wherein the step of continuously 
providing reaction vessels to the mass spectrometer comprises automatically 
transferring samples into the mass spectrometer for processing using a robotic 
mass spectrometer interface. 

41. A method of any of claims 38-40, wherein the step of receiving
test results comprises:

receiving test data from the mass spectrometer such that the test data for
a biological sample contains one or more peaks;

removing a residual baseline from the test data for a biological sample;

curve fitting each peak of the biological sample test data to
predetermined input parameters;

determining a probability that each peak of the biological sample test data
is a valid peak; and

making a data typing decision regarding the biological sample in
accordance with the determined valid peaks.

42. A method of any of claims 38-41, further including the step of
displaying exemplary test spectra for data types to be determined by the data
analysis system, along with a graph of test data picked peaks and a graph of
smoothed test spectra data for a biological sample.

43. A method of any of claims 38-42, wherein the data analysis
system receives test run input parameters that determine processing until a
different set of input parameters is received.

44. A method of claim 43, wherein the step of displaying comprises
displaying exemplary test spectra for data types to be determined by the data
analysis system, along with a graph of test data picked peaks and a graph of
smoothed test spectra data for a biological sample, and the input parameters
specify display parameters.

45. A method of claim 41, wherein the step of removing the residual
baseline from the test data comprises modeling the baseline of the mass
spectrometer data with a quadratic equation specified by the input parameters.

46. A method of claim 45, wherein the input parameters specify a
range of data over which the baseline will be modeled.

47. A method of claim 46, wherein the baseline is modeled over a 
peak free region specified by the input parameters. 

48. A method of claim 44, wherein the picked peaks graph represents
all peaks in the mass spectrometer output that have a height that exceeds the
residual baseline corrected data.

49. A method of claim 48, wherein the data analysis system validates 
a peak after comparing a probability density function for the peak free region 
with a probability density function for a fitted peak if the comparison shows that 
the respective probability density functions overlap by a predetermined amount. 

50. A method of any of claims 38-49, wherein the process line includes a
contamination-controlled environment and a non-sterile environment.

51. The method of claim 50, wherein the step of transporting includes
automatically transporting samples between the two environments in a sterile 
taxicab. 

52. The system of any of claims 13-15 that occupies two rooms,
wherein the components in each room are linked by an automated sample
transporter.

53. The system of claim 52, wherein one room is a clean room. 
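
For illustration only, and without limiting the claims, the baseline-removal step recited above (modeling the baseline with a quadratic equation over a peak free region specified by input parameters) can be sketched as follows. The function names fit_quadratic, remove_baseline and pick_peaks, the threshold parameter, and the synthetic spectrum are all hypothetical. The threshold-based peak picker is a simplification and does not implement the probability density function comparison recited for peak validation.

```python
from typing import List, Sequence, Tuple


def fit_quadratic(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float, float]:
    """Least-squares fit of y = a*x**2 + b*x + c; returns (a, b, c)."""
    if len(xs) < 3:
        raise ValueError("need at least three points to fit a quadratic")
    # Power sums for the 3x3 normal equations. For large m/z values the
    # x axis should be rescaled first to keep the system well conditioned.
    s = [sum(x ** k for x in xs) for k in range(5)]
    t = [sum(y * x ** k for x, y in zip(xs, ys)) for k in range(3)]
    # Augmented matrix with unknowns ordered (a, b, c).
    m = [
        [s[4], s[3], s[2], t[2]],
        [s[3], s[2], s[1], t[1]],
        [s[2], s[1], s[0], t[0]],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(3):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [rv - factor * cv for rv, cv in zip(m[row], m[col])]
    a, b, c = (m[i][3] / m[i][i] for i in range(3))
    return a, b, c


def remove_baseline(mz: Sequence[float], intensity: Sequence[float],
                    peak_free: Tuple[float, float]) -> List[float]:
    """Fit a quadratic over the peak-free m/z range and subtract it everywhere."""
    lo, hi = peak_free
    region = [(x, y) for x, y in zip(mz, intensity) if lo <= x <= hi]
    a, b, c = fit_quadratic([x for x, _ in region], [y for _, y in region])
    return [y - (a * x * x + b * x + c) for x, y in zip(mz, intensity)]


def pick_peaks(values: Sequence[float], threshold: float) -> List[int]:
    """Indices of local maxima above a (hypothetical) height threshold."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i] > threshold
        and values[i] > values[i - 1]
        and values[i] >= values[i + 1]
    ]


# Synthetic spectrum: quadratic baseline plus one peak of height 10 at x = 15.
mz = list(range(21))
raw = [0.01 * x * x - 0.2 * x + 5 + (10 if x == 15 else 0) for x in mz]
corrected = remove_baseline(mz, raw, peak_free=(0, 10))
peaks = pick_peaks(corrected, threshold=1.0)
```

Restricting the fit to the peak free region keeps signal peaks from pulling the modeled baseline upward, which is why the region is passed as an input parameter rather than inferred from the data.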